Abstract



The notion of class is ubiquitous in computer science and is central in many formalisms 
for the representation of structured knowledge used both in knowledge representation and 
in databases. In this paper we study the basic issues underlying such representation for- 
malisms and single out both their common characteristics and their distinguishing features. 
Such investigation leads us to propose a unifying framework in which we are able to cap- 
ture the fundamental aspects of several representation languages used in different contexts. 
The proposed formalism is expressed in the style of description logics, which have been
introduced in knowledge representation as a means to provide a semantically well-founded 
basis for the structural aspects of knowledge representation systems. The description logic 
considered in this paper is a subset of first order logic with nice computational
characteristics. It is quite expressive and features a novel combination of constructs that
has not been studied before. The distinguishing constructs are number restrictions, which
generalize existence and functional dependencies, inverse roles, which allow one to refer to the inverse of
a relationship, and possibly cyclic assertions, which are necessary for capturing real world 
domains. We are able to show that it is precisely such combination of constructs that makes 
our logic powerful enough to model the essential set of features for defining class structures 
that are common to frame systems, object-oriented database languages, and semantic data 
models. As a consequence of the established correspondences, several significant extensions
of each of the above formalisms become available. The high expressiveness of the logic we 
propose and the need for capturing the reasoning in different contexts force us to
distinguish between unrestricted and finite model reasoning. A notable feature of our proposal is
that reasoning in both cases is decidable. We argue that, by virtue of the high expressive 
power and of the associated reasoning capabilities on both unrestricted and finite models, 
our logic provides a common core for class-based representation formalisms.

1. Introduction 

In many fields of computer science we find formalisms for the representation of objects and
classes (Motschnig-Pitrik & Mylopoulous, 1992). Generally speaking, an object denotes an
element of the domain of interest, and a class denotes a set of objects with common char- 
acteristics. We use the term "class-based representation formalism" to refer to a formalism 
that allows one to express several kinds of relationships and constraints (e.g., subclass con- 
straints) holding among the classes that are meaningful in a set of applications. Moreover, 
class-based formalisms aim at taking advantage of the class structure in order to provide 
various information, such as whether a class is consistent, i.e., it admits at least one object, 
whether a class is a subclass of another class, and more generally, whether a given constraint 
holds between a given set of classes. From the above characterization, it should be clear
that the formalisms referred to in this paper deal only with the structural aspects of objects 
and classes, and do not include any features for the specification of behavioral properties of 
objects. 

Three main families of class-based formalisms are identified in this paper. The first one 
comes from knowledge representation and in particular from the work on semantic networks 
and frames (see for example Lehmann, 1992; Sowa, 1991). The second one originates in 
the field of databases and in particular from the work on semantic data models (see for 
example Hull & King, 1987). The third one arises from the work on types in programming 
languages and object-oriented systems (see for example Kim & Lochovsky, 1989). 

In the past there have been several attempts to establish relationships among the various 
families of class-based formalisms (see Section 6 for a brief survey). The proposed solutions
are not fully general and a formalism capturing both the modeling constructs and the 
reasoning techniques for all the above families is still missing. In this paper we address this 
problem by proposing a class-based representation formalism, based on description logics 
(Brachman & Levesque, 1984; Schmidt-Schauß & Smolka, 1991; Donini, Lenzerini, Nardi,
& Schaerf, 1996), and by using it for comparing other formalisms.

In description logics, structured knowledge is described by means of so called concepts 
and roles, which denote unary and binary predicates, respectively. Starting from a set of 
atomic symbols one can build complex concept and role expressions by applying suitable 
constructors which characterize a description logic. Formally, concepts are interpreted as 
subsets of a domain and roles as binary relations over that domain, and all constructs 
are equipped with a precise set-theoretic semantics. The most common constructs include 
boolean operations on concepts, and quantification over roles. For example, the concept 
$\mathsf{Person} \sqcap \forall \mathsf{child}.\mathsf{Male}$ denotes the set of individuals that are instances of the concept
Person and are connected through the role child only to instances of the concept Male,
while the concept $\exists \mathsf{child}$ denotes all individuals that are connected through the role child
to some individual. Further constructs that have been considered important include more 
general forms of quantification, number restrictions, which allow one to state limits on the 
number of connections that an individual may have via a certain role, and constructs on 
roles, such as intersection, concatenation and inverse. A description logic knowledge base, 
expressing the intensional knowledge about the modeled domain, is built by stating inclusion 
assertions between concepts, which have to be satisfied by the models of the knowledge base. 
The assertions are used to specify necessary and/or necessary and sufficient conditions for 
individuals to be instances of certain concepts. Reasoning on such knowledge bases includes 
the detection of inconsistencies in the knowledge base itself, determining whether a concept 
can be populated in a model of the knowledge base, and checking subsumption, i.e., whether 
all instances of a concept are necessarily also instances of another concept in all models of 
the knowledge base. 

In this paper we propose a description logic called ALUNI, which is quite expressive 
and includes a novel combination of constructs, including number restrictions, inverse roles, 
and inclusion assertions with no restrictions on cycles. Such features make ALUNI powerful
enough to provide a unified framework for frame systems, object-oriented languages, and 
semantic data models. We show this by establishing a precise correspondence with a frame- 
based language in the style of the one proposed by Fikes and Kehler (1985), with the 
Entity-Relationship model (Chen, 1976), and with an object-oriented language in the style
of the one introduced by Abiteboul and Kanellakis (1989). More specifically, we identify 
the most relevant features to model classes in each of the cited settings and show that 
a specification in any of those class-based formalisms can be equivalently expressed as a 
knowledge base in ALUNI. In this way, we are able to identify which are the commonalities
among the families and which are the specificities of each family. Therefore, even though
there are specific features of every family that are not addressed by ALUNI, we are able
to show that the formalism proposed in this paper provides important features that are 
currently missing in each family, although their relevance has often been stressed. In this 
sense, our unifying framework points out possible developments for the languages belonging 
to all the three families. 

One fundamental reason for regarding ALUNI as a unifying framework for class-based 
representation formalisms is that reasoning in ALUNI is hard, but nonetheless decidable, as 
shown by Calvanese, Lenzerini, and Nardi (1994) and Calvanese (1996c). Consequently, the
language features arising from different frameworks to build class-based representations are 
not just given a common semantic account, but are combined in a more expressive setting 
where one retains the capability of solving reasoning tasks. The combination of constructs 
included in the language makes it necessary to distinguish between reasoning with respect to 
finite models, i.e., models with a finite domain, and reasoning with respect to unrestricted 
models. Calvanese (1996c) devises suitable techniques for both unrestricted and finite model 
reasoning, which enable reasoning in the different contexts arising from assuming a finite
domain, as is often the case in the field of databases, or assuming that a domain can also
be infinite. In the paper, we discuss the results on reasoning in ALUNI, and compare them 
with other results on reasoning in class-based representation formalisms. 

Summarizing, our framework provides an adequate expressive power to account for 
the most significant features of the major families of class-based formalisms. Moreover, it 
is equipped with suitable techniques for reasoning in both finite and unrestricted models. 
Therefore, we believe that ALUNI captures the essential core of the class-based representation 
formalisms belonging to all three families mentioned above. 

The paper is organized as follows. In the next section we present our formalism and 
in Sections 3, 4, and 5 we discuss three families of class-based formalisms, namely, frame 
languages, semantic data models, and object-oriented data models, showing that their basic 
features are captured by knowledge bases in ALUNI. The final sections contain a review of
related work, including a discussion of reasoning in ALUNI and class-based formalisms, and
some concluding remarks. 

2. A Unifying Class-Based Representation Language 

In this section, we present ALUNI, a class-based formalism in the style of description logics 
(DLs) (Brachman & Levesque, 1984; Schmidt-Schauß & Smolka, 1991; Donini et al., 1996;
Donini, Lenzerini, Nardi, & Nutt, 1997). In DLs the domain of interest is modeled by means 
of concepts and roles, which denote classes and binary relations, respectively. Generally 
speaking, a DL is formed by three basic components: 

• A description language, which specifies how to construct complex concept and role
expressions (also called simply concepts and roles), by starting from a set of atomic
symbols and by applying suitable constructors. It is the set of allowed constructs that
characterizes the description language.

• A knowledge specification mechanism, which specifies how to construct a DL knowledge
base, in which properties of concepts and roles are asserted.

• A set of basic reasoning tasks provided by the DL.

In the rest of the section we describe the specific form that these three components assume
in ALUNI.

Construct                  Syntax                Semantics
atomic concept             $A$                   $A^{\mathcal{I}} \subseteq \Delta^{\mathcal{I}}$
atomic negation            $\neg A$              $\Delta^{\mathcal{I}} \setminus A^{\mathcal{I}}$
conjunction                $C_1 \sqcap C_2$      $C_1^{\mathcal{I}} \cap C_2^{\mathcal{I}}$
disjunction                $C_1 \sqcup C_2$      $C_1^{\mathcal{I}} \cup C_2^{\mathcal{I}}$
universal quantification   $\forall R.C$         $\{o \mid \forall o'.\, (o,o') \in R^{\mathcal{I}} \rightarrow o' \in C^{\mathcal{I}}\}$
number restrictions        $\exists^{\geq n}R$   $\{o \mid \sharp\{o' \mid (o,o') \in R^{\mathcal{I}}\} \geq n\}$ ¹
                           $\exists^{\leq n}R$   $\{o \mid \sharp\{o' \mid (o,o') \in R^{\mathcal{I}}\} \leq n\}$
atomic role                $P$                   $P^{\mathcal{I}} \subseteq \Delta^{\mathcal{I}} \times \Delta^{\mathcal{I}}$
inverse role               $P^-$                 $\{(o,o') \mid (o',o) \in P^{\mathcal{I}}\}$

Table 1: Syntax and semantics of $\mathcal{ALUNI}$

2.1 The Description Language of ALUNI 

In the description language of ALUNI, called $\mathcal{ALUNI}$, concepts and roles are formed
according to the syntax shown in Table 1, where $A$ denotes an atomic concept, $P$ an atomic
role, $C$ an arbitrary concept expression, $R$ an arbitrary role expression, and $n$ a
nonnegative integer. To increase readability of concept expressions, we also introduce the following
abbreviations:

$\top \doteq A \sqcup \neg A$, for some atomic concept $A$
$\bot \doteq A \sqcap \neg A$, for some atomic concept $A$
$\exists R \doteq \exists^{\geq 1}R$
$\exists^{=n}R \doteq \exists^{\geq n}R \sqcap \exists^{\leq n}R$

Concepts are interpreted as subsets of a domain and roles as binary relations over that 
domain. Intuitively, $\neg A$ represents the negation of an atomic concept, and is interpreted
as the complement with respect to the domain of interpretation. $C_1 \sqcap C_2$ represents the
conjunction of two concepts and is interpreted as set intersection, while $C_1 \sqcup C_2$ represents
disjunction and is interpreted as set union. Consequently, $\top$ represents the whole domain,
and $\bot$ the empty set. $\forall R.C$ is called universal quantification over roles and is used to
denote those elements of the interpretation domain that are connected through role $R$ only
to instances of the concept $C$. $\exists^{\geq n}R$ and $\exists^{\leq n}R$ are called number restrictions, and impose
on their instances restrictions on the minimum and maximum number of objects they are
connected to through role $R$. $P^-$, called the inverse of role $P$, represents the inverse of the
binary relation denoted by $P$.

1. $\sharp S$ denotes the cardinality of a set $S$.

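More formally, an interpretation $\mathcal{I} = (\Delta^{\mathcal{I}}, \cdot^{\mathcal{I}})$ consists of an interpretation domain $\Delta^{\mathcal{I}}$
and an interpretation function $\cdot^{\mathcal{I}}$ that maps every concept $C$ to a subset $C^{\mathcal{I}}$ of $\Delta^{\mathcal{I}}$ and
every role $R$ to a subset $R^{\mathcal{I}}$ of $\Delta^{\mathcal{I}} \times \Delta^{\mathcal{I}}$ according to the semantic rules specified in Table 1.
The sets $C^{\mathcal{I}}$ and $R^{\mathcal{I}}$ are called the extensions of $C$ and $R$ respectively.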

Example 2.1 Consider the concept expression 

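$\forall \mathsf{enrolls}.\mathsf{Student} \sqcap \exists^{\geq 2}\mathsf{enrolls} \sqcap \exists^{\leq 30}\mathsf{enrolls} \sqcap$
$\forall \mathsf{teaches}^-.(\mathsf{Professor} \sqcup \mathsf{GradStudent}) \sqcap \exists^{=1}\mathsf{teaches}^- \sqcap$
$\neg\mathsf{AdvCourse}$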

specifying the constraints for an object to be a university course. The expression reflects the 
fact that each course enrolls only students, and restrictions on the minimum and maximum 
number of enrolled students. By using the role teaches and the inverse constructor we 
can state the property that each course is taught by exactly one individual, who is either a 
professor or a graduate student. Finally, negation is used to express disjointness from the 
concept denoting advanced courses. ■ 
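
The set-theoretic semantics of Table 1 can be made concrete by evaluating a concept over a small finite interpretation. The following Python fragment is only an illustrative sketch: the tuple encoding of concepts and roles, the function names, and the sample interpretation (one course `db`, two students, one professor) are assumptions introduced here and are not part of the formalism. The cardinality bounds 2 and 30 follow the reading of Example 2.1 adopted above.

```python
# Concepts are nested tuples (an encoding assumed for this sketch):
#   ("atom", A), ("not", A), ("and", C1, C2), ("or", C1, C2),
#   ("all", R, C), ("atleast", n, R), ("atmost", n, R)
# Roles are ("role", P) for an atomic role or ("inv", P) for its inverse.

def role_ext(role, interp):
    """Extension of a role: a set of pairs over the domain."""
    kind, name = role
    pairs = interp["roles"].get(name, set())
    if kind == "role":
        return set(pairs)
    if kind == "inv":
        return {(b, a) for (a, b) in pairs}
    raise ValueError(f"unknown role constructor: {kind}")

def fillers(obj, role, interp):
    """Objects connected to obj through the given role."""
    return {b for (a, b) in role_ext(role, interp) if a == obj}

def concept_ext(concept, interp):
    """Extension of a concept, following the rules of Table 1."""
    domain = interp["domain"]
    tag = concept[0]
    if tag == "atom":
        return set(interp["concepts"].get(concept[1], set()))
    if tag == "not":  # negation is applied to atomic concepts only
        return domain - interp["concepts"].get(concept[1], set())
    if tag == "and":
        return concept_ext(concept[1], interp) & concept_ext(concept[2], interp)
    if tag == "or":
        return concept_ext(concept[1], interp) | concept_ext(concept[2], interp)
    if tag == "all":
        _, role, filler = concept
        allowed = concept_ext(filler, interp)
        return {o for o in domain if fillers(o, role, interp) <= allowed}
    if tag == "atleast":
        _, n, role = concept
        return {o for o in domain if len(fillers(o, role, interp)) >= n}
    if tag == "atmost":
        _, n, role = concept
        return {o for o in domain if len(fillers(o, role, interp)) <= n}
    raise ValueError(f"unknown concept constructor: {tag}")

def conj(*concepts):
    """Left-nested conjunction of one or more concepts."""
    result = concepts[0]
    for c in concepts[1:]:
        result = ("and", result, c)
    return result

enrolls = ("role", "enrolls")
teaches_inv = ("inv", "teaches")

# The concept expression of Example 2.1 (exactly-one is atleast 1 and atmost 1).
course = conj(
    ("all", enrolls, ("atom", "Student")),
    ("atleast", 2, enrolls),
    ("atmost", 30, enrolls),
    ("all", teaches_inv, ("or", ("atom", "Professor"), ("atom", "GradStudent"))),
    ("atleast", 1, teaches_inv),
    ("atmost", 1, teaches_inv),
    ("not", "AdvCourse"),
)

# A hypothetical interpretation, chosen only for illustration.
interp = {
    "domain": {"db", "alice", "bob", "carol"},
    "concepts": {"Student": {"alice", "bob"}, "Professor": {"carol"}},
    "roles": {
        "enrolls": {("db", "alice"), ("db", "bob")},
        "teaches": {("carol", "db")},
    },
}

print(concept_ext(course, interp))  # {'db'}
```

Only `db` belongs to the extension: it enrolls two students, is taught by exactly one professor, and is not an advanced course, whereas every other object fails the minimum number restriction on enrolls.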

2.2 Knowledge Bases in ALUNI 

An ALUNI knowledge base, which expresses the knowledge about classes and relations of the 
modeled domain, is formally defined as a triple $\mathcal{K} = (\mathcal{A}, \mathcal{P}, \mathcal{T})$, where $\mathcal{A}$ is a finite set of
atomic concepts, $\mathcal{P}$ is a finite set of atomic roles, and $\mathcal{T}$ is a finite set of so called inclusion
assertions. Each such assertion has the form

$A \preceq C$

where $A$ is an atomic concept and $C$ an arbitrary concept expression. Such an inclusion
assertion states by means of the concept $C$ necessary properties for an element of the domain
in order to be an instance of the atomic concept $A$. Formally, an interpretation $\mathcal{I}$ satisfies
the inclusion assertion $A \preceq C$ if $A^{\mathcal{I}} \subseteq C^{\mathcal{I}}$. An interpretation $\mathcal{I}$ is a model of a knowledge
base $\mathcal{K}$ if it satisfies all inclusion assertions in $\mathcal{K}$. A finite model is a model with finite
domain.

Example 2.1 (cont.) The assertion 

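$\mathsf{Course} \preceq \forall \mathsf{enrolls}.\mathsf{Student} \sqcap \exists^{\geq 2}\mathsf{enrolls} \sqcap \exists^{\leq 30}\mathsf{enrolls} \sqcap$
$\qquad \forall \mathsf{teaches}^-.(\mathsf{Professor} \sqcup \mathsf{GradStudent}) \sqcap \exists^{=1}\mathsf{teaches}^-$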

makes use of a complex concept expression to state necessary conditions for an object to 
be an instance of the concept Course. ■ 
In ALUNI no restrictions are imposed on the form that the inclusion assertions may
assume. In particular we do not rule out cyclic assertions, i.e., assertions in which the 
concept expression on the right hand side refers, either directly or indirectly via other 
assertions, to the atomic concept on the left hand side. In the presence of cyclic assertions 
different semantics may be adopted (Nebel, 1991). The one defined above, called descriptive 
semantics, accepts all interpretations that satisfy the assertions in the knowledge base, and 
hence interprets assertions as constraints on the domain to be modeled. For inclusion 
assertions, descriptive semantics has been claimed to provide the most intuitive results 
(Buchheit, Donini, Nutt, & Schaerf, 1998). Alternative semantics which have been proposed 
are based on fixpoint constructions (Nebel, 1991; Schild, 1994; De Giacomo & Lenzerini, 
1994b), and hence allow one to define in a unique way the interpretation of concepts.

In general, cycles in the knowledge base increase the complexity of reasoning (Nebel, 
1991; Baader, 1996; Calvanese, 1996b) and require a special treatment by reasoning proce- 
dures (Baader, 1991; Buchheit, Donini, & Schaerf, 1993). For this reason, many DL based 
systems assume the knowledge base to be acyclic (Brachman, McGuinness, Patel-Schneider, 
Alperin Resnick, & Borgida, 1991; Bresciani, Franconi, & Tessaris, 1995). However, this as- 
sumption is unrealistic in practice, and cycles are definitely necessary for a correct modeling 
in many application domains. Indeed, the use of cycles is allowed in all data models used 
in databases, and, as shown in the following sections, in order to capture their semantics in 
ALUNI the possibility of using cyclic assertions is fundamental. 
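
Whether a knowledge base contains cyclic assertions in the sense described above can be checked on the dependency graph that links each atomic concept to the atomic concepts occurring on the right-hand side of its assertions. The sketch below assumes that this graph has already been extracted into a plain dictionary; the function names are hypothetical, and the two sample graphs are transcribed by hand from the knowledge bases of Figure 1 and Figure 3.

```python
def reachable(start, deps):
    """Atomic concepts referred to by `start`, directly or via other assertions."""
    seen = set()
    stack = list(deps.get(start, ()))
    while stack:
        concept = stack.pop()
        if concept in seen:
            continue
        seen.add(concept)
        stack.extend(deps.get(concept, ()))
    return seen

def cyclic_concepts(deps):
    """Atomic concepts whose assertions refer back to themselves."""
    return {a for a in deps if a in reachable(a, deps)}

# Dependencies of the knowledge base in Figure 1.
k_even = {
    "Number": {"Even"},
    "Even": {"Number"},
}

# Dependencies of the knowledge base in Figure 3.
university = {
    "Course": {"Student", "Professor", "GradStudent"},
    "AdvCourse": {"Course", "GradStudent", "Undergrad"},
    "BasCourse": {"Course", "Professor"},
    "GradStudent": {"Student", "String"},
    "Undergrad": {"Student"},
}

print(sorted(cyclic_concepts(k_even)))      # ['Even', 'Number']
print(sorted(cyclic_concepts(university)))  # []
```

A system that assumes acyclic knowledge bases would accept the second knowledge base and reject the first, which is precisely the kind of knowledge base ALUNI is meant to admit.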

Besides inclusion assertions, some DL based systems also make use of equivalence as- 
sertions (Buchheit et al., 1993), which express both necessary and sufficient conditions for 
an object to be an instance of a concept. Although this possibility is ruled out in ALUNI, 
this does not limit its ability to capture both frame based systems and database models,
where the constraints that can be expressed correspond naturally to inclusion assertions. 

2.3 Reasoning in ALUNI 

The basic tasks we consider when reasoning over an ALUNI knowledge base are concept
consistency and concept subsumption: 

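• Concept consistency is the problem of deciding whether a concept $C$ is consistent in
a knowledge base $\mathcal{K}$ (written as $\mathcal{K} \not\models C \preceq \bot$), i.e., whether $\mathcal{K}$ admits a model $\mathcal{I}$ such
that $C^{\mathcal{I}} \neq \emptyset$.

• Concept subsumption is the problem of deciding whether a concept $C_1$ is subsumed by
a concept $C_2$ in a knowledge base $\mathcal{K}$ (written as $\mathcal{K} \models C_1 \preceq C_2$), i.e., whether $C_1^{\mathcal{I}} \subseteq C_2^{\mathcal{I}}$
for each model $\mathcal{I}$ of $\mathcal{K}$.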

The inclusion of number restrictions and inverse roles in $\mathcal{ALUNI}$ and the ability in
ALUNI of using arbitrary, possibly cyclic inclusion assertions allow one to construct a
knowledge base in which a certain concept is consistent but has necessarily an empty extension
in all finite models of the knowledge base. Similarly, a subsumption relation between two 
concepts may hold only if infinite models of the knowledge base are ruled out and only finite 
models are considered. 
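$\mathcal{K}_{even} = (\mathcal{A}, \mathcal{P}, \mathcal{T})$, where
$\mathcal{A} = \{\mathsf{Number}, \mathsf{Even}\}$,
$\mathcal{P} = \{\mathsf{doubles}\}$,

and the set $\mathcal{T}$ of assertions consists of:

$\mathsf{Number} \preceq \exists \mathsf{doubles}^- \sqcap \forall \mathsf{doubles}^-.\mathsf{Even}$

$\mathsf{Even} \preceq \mathsf{Number} \sqcap \exists^{\leq 1}\mathsf{doubles} \sqcap \forall \mathsf{doubles}.\mathsf{Number}$

Figure 1: An ALUNI knowledge base with two concepts that are equivalent in all finite
models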



Example 2.2 Let $\mathcal{K}_{even}$ be the knowledge base shown in Figure 1. Intuitively, the
assertions in $\mathcal{K}_{even}$ state that for each number there is an even number which doubles it, and
that all numbers which double it are even. Each even number is a number, doubles at most
one number, and doubles only numbers. Observe that for any model $\mathcal{I}$ of $\mathcal{K}_{even}$, the
universal quantifications together with the functionality of doubles in the assertions imply that
$\sharp\mathsf{Even}^{\mathcal{I}} \geq \sharp\mathsf{Number}^{\mathcal{I}}$, while the direct inclusion assertion between Even and Number implies
that $\sharp\mathsf{Even}^{\mathcal{I}} \leq \sharp\mathsf{Number}^{\mathcal{I}}$. Therefore, the two concepts have the same cardinality, and since
one is a sub-concept of the other, if the domain is finite, their extensions coincide. This
does not necessarily hold for infinite domains. In fact, the names we have chosen suggest
already an infinite model of the knowledge base in which Number and Even are interpreted
differently. The model is obtained by taking the natural numbers as domain, and
interpreting Number as the whole domain, Even as the even numbers, and doubles as the set
$\{(2n, n) \mid n \geq 0\}$. ■
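
The cardinality argument of Example 2.2 can be checked by brute force on very small domains. The sketch below enumerates every interpretation of Number, Even, and doubles over domains with at most three elements, keeps those that satisfy the two assertions of Figure 1, and records any model in which the two extensions differ. The function names and the domain bound are assumptions made for illustration; an exhaustive search of this kind is of course no substitute for the general argument, nor for the decision procedures discussed later.

```python
from itertools import chain, combinations, product

def subsets(items):
    """All subsets of a finite collection, as tuples."""
    items = list(items)
    return chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))

def is_model(number, even, doubles):
    """Check the two inclusion assertions of Figure 1."""
    # Number: some object doubles it, and every object doubling it is Even.
    for n in number:
        doublers = {o for (o, t) in doubles if t == n}
        if not doublers or not doublers <= even:
            return False
    # Even: is a Number, doubles at most one object, and doubles only Numbers.
    for e in even:
        doubled = {t for (o, t) in doubles if o == e}
        if e not in number or len(doubled) > 1 or not doubled <= number:
            return False
    return True

def check_finite_models(max_size):
    """Count the models up to max_size and collect those with Number != Even."""
    models = 0
    counterexamples = []
    for size in range(1, max_size + 1):
        domain = range(size)
        pairs = list(product(domain, repeat=2))
        for number in map(frozenset, subsets(domain)):
            # The second assertion forces Even to be a subset of Number,
            # so only subsets of Number need to be tried.
            for even in map(frozenset, subsets(number)):
                for doubles in map(frozenset, subsets(pairs)):
                    if is_model(number, even, doubles):
                        models += 1
                        if number != even:
                            counterexamples.append((number, even, doubles))
    return models, counterexamples

if __name__ == "__main__":
    models, counterexamples = check_finite_models(3)
    print("models found:", models)
    print("models with Number != Even:", len(counterexamples))
```

Under these assumptions the search reports no model in which Number and Even differ, in agreement with the example: such a separation requires an infinite domain, as in the model over the natural numbers.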

The example above shows that ALUNI does not have the finite model property, which
states that if a concept is consistent in a knowledge base then the knowledge base admits
a finite model in which the concept has a nonempty extension. Therefore, it is important
to distinguish between reasoning with respect to unrestricted models and reasoning with
respect to finite models. We call (unrestricted) concept consistency (written as $\mathcal{K} \not\models_u C \preceq \bot$)
and (unrestricted) concept subsumption (written as $\mathcal{K} \models_u C_1 \preceq C_2$) the reasoning tasks
as described above, i.e., carried out without restricting the attention to finite models. The
corresponding reasoning tasks carried out by considering finite models only are called finite
concept consistency (written as $\mathcal{K} \not\models_f C \preceq \bot$) and finite concept subsumption (written as
$\mathcal{K} \models_f C_1 \preceq C_2$).

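Example 2.2 (cont.) Summing up the previous considerations, we can say that Number is
not subsumed by Even in $\mathcal{K}_{even}$, i.e., $\mathcal{K}_{even} \not\models_u \mathsf{Number} \preceq \mathsf{Even}$, but is finitely subsumed, i.e.,
$\mathcal{K}_{even} \models_f \mathsf{Number} \preceq \mathsf{Even}$. Equivalently, $\mathsf{Number} \sqcap \neg\mathsf{Even}$ is consistent in $\mathcal{K}_{even}$, i.e.,
$\mathcal{K}_{even} \not\models_u \mathsf{Number} \sqcap \neg\mathsf{Even} \preceq \bot$, but is not finitely consistent, i.e., $\mathcal{K}_{even} \models_f \mathsf{Number} \sqcap \neg\mathsf{Even} \preceq \bot$. ■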

A distinguishing feature of ALUNI is that reasoning both in the finite and in the un- 
restricted case is decidable. In particular, unrestricted concept satisfiability and concept 
subsumption are decidable in deterministic exponential time (De Giacomo & Lenzerini, 
1994a; Calvanese et al., 1994), and since reasoning in strict sublanguages of ALUNI is
already EXPTIME-hard (Calvanese, 1996c), the known algorithms are computationally
optimal. Finite concept consistency in ALUNI is also decidable in deterministic exponential time
while finite concept subsumption (in the general case) is decidable in deterministic double 
exponential time (Calvanese, 1996c). A more precise discussion on the methods for reason- 
ing in ALUNI is provided in Section 6.2, while a detailed account of the adopted algorithms 
and an analysis of their computational complexity is presented by Calvanese (1996c). 

In the next sections we show how the two forms of reasoning, with respect to unrestricted
and finite models, capture the reasoning tasks that are typically considered in different
formalisms for the structured representation of knowledge. 

3. Frame Based Systems 

Frame languages are based on the idea of expressing knowledge by means of frames, which 
are structures representing classes of objects in terms of the properties that their instances 
must satisfy. Such properties are defined by the frame slots, that constitute the items of a 
frame definition. Since the 70s a large number of frame systems have been developed, with 
different goals and different features. DLs bear a close relationship with the KL-ONE family 
of frame systems (Woods & Schmolze, 1992). However, here we would like to consider frame 
systems from a more general perspective, as discussed for example by Karp (1992), Karp, 
Myers, and Gruber (1995), and establish the correspondence with ALUNI knowledge bases 
in this context. 

We remark that we are restricting our attention to those aspects that are related to 
the taxonomic structure. Moreover, as discussed below, we consider assertional knowledge 
bases, where intensional knowledge is characterized in terms of inclusion assertions rather 
than definitions. In addition, we do not consider those features that cannot be captured in 
a first-order framework, such as default values in the slots, attached procedures, and the 
specification of overriding inheritance policies. Some of the issues concerning the modeling 
of these aspects in DLs are addressed by Donini, Lenzerini, Nardi, Nutt, and Schaerf (1994)
and by Donini, Nardi, and Rosati (1995), within a modal nonmonotonic extension of DLs.

3.1 Syntax of Frame Based Systems 

To make the correspondence precise, we need to fix syntax and semantics for the frame 
systems we consider. Unfortunately, there is no accepted standard and we have chosen to 
use here basically the notation adopted by Fikes and Kehler (1985), which is used also in 
the KEE² system.

Definition 3.1 A frame knowledge base, denoted by $\mathcal{F}$, is formed by a set of frame and
slot names, and is constituted by a set of frame definitions of the following form:

Frame: $F$ in KB $\mathcal{F}$ $E$,



2. KEE is a trademark of Intellicorp. Note that a KEE user does not directly specify her knowledge base 
in this notation, but is allowed to define frames interactively via the graphical system interface. 
Frame: Course in KB University
    MemberSlot: enrolls
        ValueClass: Student
        Cardinality.Min: 2
        Cardinality.Max: 30
    MemberSlot: taughtby
        ValueClass: (UNION GradStudent
                           Professor)
        Cardinality.Min: 1
        Cardinality.Max: 1

Frame: AdvCourse in KB University
    Superclasses: Course
    MemberSlot: enrolls
        ValueClass: (INTERSECTION
                        GradStudent
                        (NOT Undergrad))
        Cardinality.Max: 20

Frame: BasCourse in KB University
    Superclasses: Course
    MemberSlot: taughtby
        ValueClass: Professor

Frame: Professor in KB University

Frame: Student in KB University

Frame: GradStudent in KB University
    Superclasses: Student
    MemberSlot: degree
        ValueClass: String
        Cardinality.Min: 1
        Cardinality.Max: 1

Frame: Undergrad in KB University
    Superclasses: Student

Figure 2: A KEE knowledge base



where $E$ is a frame expression, i.e., an expression formed according to the following syntax:

$E$ →  Superclasses: $F_1, \ldots, F_h$
       MemberSlot: $S_1$
           ValueClass: $H_1$
           Cardinality.Min: $m_1$
           Cardinality.Max: $n_1$
       ...
       MemberSlot: $S_k$
           ValueClass: $H_k$
           Cardinality.Min: $m_k$
           Cardinality.Max: $n_k$

$F$ and $S$ denote frame and slot names, respectively, $m$ and $n$ denote positive integers, and
$H$ denotes a slot constraint, which can be specified as follows:

$H$ →  $F$ |
       (INTERSECTION $H_1$ $H_2$) |
       (UNION $H_1$ $H_2$) |
       (NOT $H$)

■

For readers that are familiar with the KEE system, we point out that we omit the 
specification of the sub-classes for a frame present in KEE, since it can be directly derived 
from the specification of the super-classes. 

Example 3.2 Figure 2 shows a simple example of a knowledge base modeling the situation
at a university, expressed in the frame language we have presented. The frame Course
represents courses which enroll students and are taught either by graduate students or
professors. Cardinality restrictions are used to impose a minimum and maximum number
of students that may be enrolled in a course, and to express that each course is taught by 
exactly one individual. The frame AdvCourse represents courses which enroll only graduate 
students, i.e., students who already have a degree. Basic courses, on the other hand, may 
be taught only by professors. ■ 



3.2 Semantics of Frame Based Systems 

To give semantics to a set of frame definitions we resort to their interpretation in terms of 
first-order predicate calculus (Hayes, 1979). According to such interpretation, frame names 
are treated as unary predicates, while slots are considered binary predicates. 

A frame expression $E$ is interpreted as a predicate logic formula $E(x)$, which has one
free variable, and consists of the conjunction of sentences, obtained from the super-class
specification and from each slot specification. In particular, for the super-classes $F_1, \ldots, F_h$
we have:

$F_1(x) \wedge \cdots \wedge F_h(x)$



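and for a slot specification

MemberSlot: $S$
    ValueClass: $H$
    Cardinality.Min: $m$
    Cardinality.Max: $n$

we have

$\forall y.(S(x,y) \rightarrow H(y)) \;\wedge$
$\exists y_1, \ldots, y_m.\big((\bigwedge_{i \neq j} y_i \neq y_j) \wedge S(x,y_1) \wedge \cdots \wedge S(x,y_m)\big) \;\wedge$
$\forall y_1, \ldots, y_{n+1}.\big((S(x,y_1) \wedge \cdots \wedge S(x,y_{n+1})) \rightarrow \bigvee_{i \neq j} y_i = y_j\big)$,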

under the assumption that within one frame definition the occurrences of x refer to the same 
free variable. Finally the constraints on the slots are interpreted as conjunction, disjunction 
and negation, respectively, i.e.: 

(INTERSECTION $H_1$ $H_2$) is interpreted as $H_1(x) \wedge H_2(x)$
(UNION $H_1$ $H_2$) is interpreted as $H_1(x) \vee H_2(x)$
(NOT $H$) is interpreted as $\neg H(x)$

A frame definition

Frame: $F$ in KB $\mathcal{F}$ $E$

is then considered as the universally quantified sentence of the form

$\forall x.(F(x) \rightarrow E(x))$.

The whole frame knowledge base $\mathcal{F}$ is considered as the conjunction of all first-order
sentences corresponding to the frame definitions in $\mathcal{F}$.

Here we regard frame definitions as necessary conditions, which is commonplace in the 
frame systems known as assertional frame systems, as opposed to definitional systems, 
where frame definitions are interpreted as necessary and sufficient conditions. 
In order to enable the comparison with our formalisms for representing structured
knowledge we restrict our attention to the reasoning tasks that involve the frame knowledge base,
independently of the assertional knowledge, i.e., the frame instances. Fikes and Kehler
(1985) mention several reasoning services associated with frames, such as: 

• Consistency checking, which amounts to verifying whether a frame F is satisfiable 
in a knowledge base. In particular, this involves both reasoning on cardinalities and 
checking whether the filler of a given slot belongs to a certain frame. 

• Inheritance, which, in our case, amounts to the ability of identifying which of the 
frames are more general than a given frame, sometimes called all-super-of (Karp 
et al., 1995). All the properties of the more general frames are then inherited by the 
more specific one. Such reasoning is therefore based on the more general ability to
check the mutual relationships between frame descriptions in the knowledge base.

These reasoning services are formalized in the first-order semantics as follows. 

Definition 3.3 Let $\mathcal{F}$ be a frame knowledge base and $F$ a frame in $\mathcal{F}$. We say that $F$ is
consistent in $\mathcal{F}$ if the first-order sentence $\mathcal{F} \wedge \exists x.F(x)$ is satisfiable. Moreover, we say that
a frame description $E$ is more general than $F$ in $\mathcal{F}$ if $\mathcal{F} \models \forall x.(F(x) \rightarrow E(x))$. ■

3.3 Relationship between Frame Based Systems and ALUNI 

The first-order semantics given above allows us to establish a straightforward relationship 
between frame languages and ALUNI. Indeed, we now present a translation from frame 
knowledge bases to ALUNI knowledge bases. 

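We first define the function $\theta$ that maps each frame expression into an $\mathcal{ALUNI}$ concept
expression as follows:

• Every frame name $F$ is mapped into an atomic concept $\theta(F)$.

• Every slot name $S$ is mapped into an atomic role $\theta(S)$.

• Every slot constraint is mapped as follows:

    (UNION $H_1$ $H_2$)         is mapped into   $\theta(H_1) \sqcup \theta(H_2)$.
    (INTERSECTION $H_1$ $H_2$)  is mapped into   $\theta(H_1) \sqcap \theta(H_2)$.
    (NOT $H$)                   is mapped into   $\neg\theta(H)$.

• Every frame expression of the form

    Superclasses: $F_1, \ldots, F_h$
    MemberSlot: $S_1$
        ValueClass: $H_1$
        Cardinality.Min: $m_1$
        Cardinality.Max: $n_1$
    ...
    MemberSlot: $S_k$
        ValueClass: $H_k$
        Cardinality.Min: $m_k$
        Cardinality.Max: $n_k$

is mapped into the class expression

    $\theta(F_1) \sqcap \cdots \sqcap \theta(F_h) \;\sqcap$
    $\forall\theta(S_1).\theta(H_1) \sqcap \exists^{\geq m_1}\theta(S_1) \sqcap \exists^{\leq n_1}\theta(S_1) \;\sqcap$
    $\cdots$
    $\forall\theta(S_k).\theta(H_k) \sqcap \exists^{\geq m_k}\theta(S_k) \sqcap \exists^{\leq n_k}\theta(S_k)$.

$\mathcal{K} = (\mathcal{A}, \mathcal{P}, \mathcal{T})$, where

$\mathcal{A}$ = {Course, AdvCourse, BasCourse, Professor, Student, GradStudent, Undergrad, String},
$\mathcal{P}$ = {enrolls, taughtby, degree},
and the set $\mathcal{T}$ of assertions consists of:

$\mathsf{Course} \preceq \forall \mathsf{enrolls}.\mathsf{Student} \sqcap \exists^{\geq 2}\mathsf{enrolls} \sqcap \exists^{\leq 30}\mathsf{enrolls} \sqcap$
$\qquad \forall \mathsf{taughtby}.(\mathsf{Professor} \sqcup \mathsf{GradStudent}) \sqcap \exists^{=1}\mathsf{taughtby}$

$\mathsf{AdvCourse} \preceq \mathsf{Course} \sqcap \forall \mathsf{enrolls}.(\mathsf{GradStudent} \sqcap \neg\mathsf{Undergrad}) \sqcap \exists^{\leq 20}\mathsf{enrolls}$

$\mathsf{BasCourse} \preceq \mathsf{Course} \sqcap \forall \mathsf{taughtby}.\mathsf{Professor}$

$\mathsf{GradStudent} \preceq \mathsf{Student} \sqcap \forall \mathsf{degree}.\mathsf{String} \sqcap \exists^{=1}\mathsf{degree}$

$\mathsf{Undergrad} \preceq \mathsf{Student}$

Figure 3: The ALUNI knowledge base corresponding to the KEE knowledge base in Figure 2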

This mapping allows us to translate a frame knowledge base into an ALUNI knowledge base,
as specified in the following definition.

Definition 3.4 The ALUNI knowledge base $\theta(\mathcal{F}) = (\mathcal{A}, \mathcal{P}, \mathcal{T})$ corresponding to a frame
knowledge base $\mathcal{F}$ is obtained as follows:

• $\mathcal{A}$ consists of one atomic concept $\theta(F)$ for each frame name $F$ in $\mathcal{F}$.

• $\mathcal{P}$ consists of one atomic role $\theta(S)$ for each slot name $S$ in $\mathcal{F}$.

• $\mathcal{T}$ consists of an inclusion assertion

$\theta(F) \preceq \theta(E)$

for each frame definition

Frame: $F$ in KB $\mathcal{F}$
$E$

in $\mathcal{F}$. ■
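
The translation of Definition 3.4 can be sketched in a few lines of code. The following is a minimal illustration, not part of the original formalization: the class names `Slot` and `Frame`, the functions `theta_desc` and `theta_frame`, the textual rendering of concept expressions, the use of ⊤ for a frame without constraints, and the frame `Seminar` are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union

# A frame description H is a frame name or a nested tuple
# ("UNION", H1, H2), ("INTERSECTION", H1, H2) or ("NOT", H).
Desc = Union[str, tuple]


@dataclass
class Slot:
    name: str
    value_class: Optional[Desc] = None  # ValueClass
    card_min: Optional[int] = None      # Cardinality.Min
    card_max: Optional[int] = None      # Cardinality.Max


@dataclass
class Frame:
    name: str
    superclasses: List[str] = field(default_factory=list)
    slots: List[Slot] = field(default_factory=list)


def theta_desc(h: Desc) -> str:
    """Translate a frame description into a concept expression (as text)."""
    if isinstance(h, str):
        return h
    op = h[0]
    if op == "UNION":
        return f"({theta_desc(h[1])} ⊔ {theta_desc(h[2])})"
    if op == "INTERSECTION":
        return f"({theta_desc(h[1])} ⊓ {theta_desc(h[2])})"
    if op == "NOT":
        return f"¬{theta_desc(h[1])}"
    raise ValueError(f"unknown constructor: {op!r}")


def theta_frame(f: Frame) -> str:
    """Build the inclusion assertion for one frame definition."""
    conjuncts = list(f.superclasses)
    for s in f.slots:
        if s.value_class is not None:
            conjuncts.append(f"∀{s.name}.{theta_desc(s.value_class)}")
        if s.card_min is not None:
            conjuncts.append(f"∃≥{s.card_min} {s.name}")
        if s.card_max is not None:
            conjuncts.append(f"∃≤{s.card_max} {s.name}")
    # Assumption: a frame without any constraint is included in the top concept.
    rhs = " ⊓ ".join(conjuncts) if conjuncts else "⊤"
    return f"{f.name} ≼ {rhs}"


# Hypothetical frame used only for illustration.
seminar = Frame(
    "Seminar",
    superclasses=["Course"],
    slots=[Slot("enrolls", ("INTERSECTION", "GradStudent", ("NOT", "Undergrad")), None, 20)],
)
print(theta_frame(seminar))
```

Each frame definition yields exactly one inclusion assertion, so the translated knowledge base grows linearly with the size of the frame knowledge base.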



Example 3.2 (cont.) We illustrate the translation on the frame knowledge base in Figure 2.
The corresponding ALUNI knowledge base is shown in Figure 3.

The correctness of the translation follows from the correspondence between the
set-theoretic semantics of ALUNI and the first-order interpretation of frames (see for example
Hayes, 1979; Borgida, 1996; Donini et al., 1996). We can observe that inverse roles are in
fact not necessary for the formalization of frames. Indeed, the possibility of referring to the
inverse of a slot has rarely been considered in frame knowledge representation systems (some
exceptions are reported in Karp, 1992). Due to the absence of inverse roles the distinction
between reasoning in finite and unrestricted models is not necessary³. Consequently, all
the above mentioned forms of reasoning are captured by unrestricted concept consistency
and concept subsumption in ALUNI knowledge bases. This is summarized in the following
theorem.

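Theorem 3.5 Let $\mathcal{F}$ be a frame knowledge base, $F$ be a frame in $\mathcal{F}$, $E$ be a frame
description, and $\theta(\mathcal{F})$, $\theta(F)$, and $\theta(E)$ be their translations in ALUNI. Then the following
hold:

• $F$ is consistent in $\mathcal{F}$ if and only if $\theta(\mathcal{F}) \not\models_u \theta(F) \preceq \bot$.

• $E$ is more general than $F$ in $\mathcal{F}$ if and only if $\theta(\mathcal{F}) \models_u \theta(F) \preceq \theta(E)$.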

Proof. The claim directly follows from the semantics of frame knowledge bases and the 
translation into DLs that we have adopted. □ 

By Theorem 3.5 it becomes possible to exploit the methods for unrestricted reasoning 
on ALUNI knowledge bases in order to reason on frame knowledge bases. Since the problem 
of reasoning, e.g., in KEE is already EXPTIME-complete, we do not pay in terms of
computational complexity for the expressiveness added by the constructs of ALUNI. In fact, by
resorting to the correspondence with ALUNI it becomes possible to add to frame systems 
useful features, such as the possibility of specifying the inverse of a slot (Karp, 1992), and 
still retain the ability to reason in EXPTIME. 

4. Semantic Data Models 

Semantic data models were introduced primarily as formalisms for database schema design. 
They provide a means to model databases in a much richer way than traditional data 
models supported by Database Management Systems, and are becoming more and more 
important because they are adopted in most of the recent database design methodologies 
and Computer Aided Software Engineering tools. 

The most widespread semantic data model is the Entity-Relationship (ER) model in- 
troduced by Chen (1976). It has by now become a standard, extensively used in the design 
phase of commercial applications. In the commonly accepted ER notation, classes are called 
entities and are represented as boxes, whereas relationships between entities are represented 
as diamonds. Arrows between entities, called ISA relationships, represent inclusion assertions.
The links between entities and relationships represent the ER-roles, to which number
restrictions are associated. Dashed links are used whenever such restrictions are refined for 
more specific entities. Finally, elementary properties of entities are modeled by attributes,
whose values belong to one of several predefined domains, such as Integer, String, or
Boolean.

3. If we eliminate inverse roles from ALUNI, then the resulting DL has the finite model property.

The ER model does not provide constructs for expressing explicit disjointness or disjunc- 
tion of entities, but extensions of the model allow for the use of generalization hierarchies 
which represent a combination of these two constructs. In order to keep the presenta- 
tion simple, we do not consider generalization hierarchies in the formalization we provide, 
although their addition would be straightforward. Similarly, we omit attributes of relations. 

We now show that all relevant aspects of the ER model can be captured in ALUNI, and
thus that reasoning on an ER schema can be reduced to reasoning on the corresponding
ALUNI knowledge base. Since ALUNI is equipped with capabilities to reason on knowledge
bases, both with respect to finite and unrestricted models (see Section 6.2), the reduction 
shows that reasoning on the ER model, and more generally on semantic data models, is 
decidable. 

As in the case of frame-based systems, we restrict our attention to those aspects that 
constitute the core of the ER model. For this reason we do not consider some features, 
such as keys and weak entities, that have been introduced in the literature (Chen, 1976), 
but appear only in some of the formalizations of the ER model and the methodologies for 
conceptual modeling based on the model. A proposal for the treatment of keys in description 
logics is presented by Borgida and Weddell (1997). 

In order to establish the correspondence between the ER model and ALUNI, we present
formal syntax and semantics of ER schemata. 

4.1 Syntax of the Entity-Relationship Model

Although the ER model has by now become an industrial standard, several variants and 
extensions have been introduced, which differ in minor aspects in expressiveness and in 
notation (Chen, 1976; Teorey, 1989; Batini, Ceri, & Navathe, 1992; Thalheim, 1992, 1993). 
Also, ER schemata are usually defined using a graphical notation which is particularly 
useful for an easy visualization of the data dependencies, but which is not well suited for our 
purposes. Therefore we have chosen a formalization of the ER model which abstracts with
respect to the most important characteristics and allows us to develop the correspondence
to ALUNI.

In the following, for two finite sets $X$ and $Y$ we call a function from a subset of $X$
to $Y$ an $X$-labeled tuple over $Y$. The labeled tuple $T$ that maps $x_i \in X$ to $y_i \in Y$, for
$i \in \{1, \ldots, k\}$, is denoted $[x_1\colon y_1, \ldots, x_k\colon y_k]$. We also write $T[x_i]$ to denote $y_i$.

Definition 4.1 An ER schema is a tuple $S = (\mathcal{L}_S, \preceq_S, att_S, rel_S, card_S)$, where

• $\mathcal{L}_S$ is a finite alphabet partitioned into a set $\mathcal{E}_S$ of entity symbols, a set $\mathcal{A}_S$ of attribute
symbols, a set $\mathcal{U}_S$ of role symbols, a set $\mathcal{R}_S$ of relationship symbols, and a set $\mathcal{D}_S$ of
domain symbols; each domain symbol $D$ has an associated predefined basic domain
$D^{\mathcal{B}_D}$, and we assume the various basic domains to be pairwise disjoint.

• $\preceq_S \subseteq \mathcal{E}_S \times \mathcal{E}_S$ is a binary relation over $\mathcal{E}_S$.

• $att_S$ is a function that maps each entity symbol in $\mathcal{E}_S$ to an $\mathcal{A}_S$-labeled tuple over $\mathcal{D}_S$.

• $rel_S$ is a function that maps each relationship symbol in $\mathcal{R}_S$ to a $\mathcal{U}_S$-labeled tuple
over $\mathcal{E}_S$. We assume without loss of generality that:

- Each role is specific to exactly one relationship, i.e., for two relationships
$R, R' \in \mathcal{R}_S$ with $R \neq R'$, if $rel_S(R) = [U_1\colon E_1, \ldots, U_k\colon E_k]$ and $rel_S(R') =
[U'_1\colon E'_1, \ldots, U'_{k'}\colon E'_{k'}]$, then $\{U_1, \ldots, U_k\}$ and $\{U'_1, \ldots, U'_{k'}\}$ are disjoint.

- For each role $U \in \mathcal{U}_S$ there is a relationship $R$ and an entity $E$ such that
$rel_S(R) = [\ldots, U\colon E, \ldots]$.

• $card_S$ is a function from $\mathcal{E}_S \times \mathcal{R}_S \times \mathcal{U}_S$ to $\mathbb{N}_0 \times (\mathbb{N}_0 \cup \{\infty\})$ that satisfies the
following condition: for a relationship $R \in \mathcal{R}_S$ such that $rel_S(R) = [U_1\colon E_1, \ldots, U_k\colon E_k]$,
$card_S(E, R, U)$ is defined only if $U = U_i$ for some $i \in \{1, \ldots, k\}$ and if $E \preceq^*_S E_i$
(where $\preceq^*_S$ denotes the reflexive transitive closure of $\preceq_S$). The first component
of $card_S(E, R, U)$ is denoted with $cmin_S(E, R, U)$ and the second component with
$cmax_S(E, R, U)$. If not stated otherwise, $cmin_S(E, R, U)$ is assumed to be 0 and
$cmax_S(E, R, U)$ is assumed to be $\infty$. ■

Before specifying the formal semantics of ER schemata we give an intuitive description of
the components of a schema. The relation $\preceq_S$ models the ISA-relationship between entities.
We do not need to make any special assumption on the form of $\preceq_S$ such as acyclicity
or injectivity. The function $att_S$ is used to model attributes of entities. If for example
$att_S$ associates the $\mathcal{A}_S$-labeled tuple $[A_1\colon$ Integer, $A_2\colon$ String$]$ to an entity $E$, then $E$ has
two attributes $A_1$, $A_2$ whose values are integers and strings respectively. For simplicity we
assume attributes to be single-valued and mandatory, but we could easily handle also
multi-valued attributes with associated cardinalities. The function $rel_S$ associates a set of roles
to each relationship symbol $R$, determining implicitly also the arity of $R$, and for each role
$U$ in such set a distinguished entity, called the primary entity for $U$ in $R$. In a database
satisfying the schema only instances of the primary entity are allowed to participate in
the relationship via the role $U$. The function $card_S$ specifies cardinality constraints, i.e.,
constraints on the minimum and maximum number of times an instance of an entity may
participate in a relationship via some role. Since such constraints are meaningful only if
the entity can effectively participate in the relationship, the function is defined only for
the sub-entities of the primary entity. The special value $\infty$ is used when no restriction is
posed on the maximum cardinality. Such constraints can be used to specify both existence
dependencies and functionality of relations (Cosmadakis & Kanellakis, 1986). They are
often used only in a restricted form, where the minimum cardinality is either 0 or 1 and
the maximum cardinality is either 1 or $\infty$. Cardinality constraints in the form considered
here have been introduced already by Abrial (1974) and subsequently studied by Grant
and Minker (1984), Lenzerini and Nobili (1990), Ferg (1991), Ye, Parent, and Spaccapietra
(1994), and Thalheim (1992).

Example 4.2 Figure 4 shows a simple ER schema modeling a state of affairs similar to the 
one represented by the KEE knowledge base in Figure 2. We have used the standard graphic 
notation for ER schemata, except for the dashed link, which represents the refinement of 
a cardinality constraint for the participation of a sub-entity (in our case AdvCourse) in a 
relationship (in our case ENROLLING). ■ 

Figure 4: An ER schema

4.2 Semantics of the Entity-Relationship Model

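The semantics of an ER schema can be given by specifying which database states are
consistent with the information structure represented by the schema. Formally, a database
state $\mathcal{B}$ corresponding to an ER schema $S = (\mathcal{L}_S, \preceq_S, att_S, rel_S, card_S)$ is constituted by a
nonempty finite set $\Delta^{\mathcal{B}}$, assumed to be disjoint from all basic domains, and a function $\cdot^{\mathcal{B}}$
that maps

• every domain symbol $D \in \mathcal{D}_S$ to the corresponding basic domain $D^{\mathcal{B}_D}$,

• every entity $E \in \mathcal{E}_S$ to a subset $E^{\mathcal{B}}$ of $\Delta^{\mathcal{B}}$,

• every attribute $A \in \mathcal{A}_S$ to a set $A^{\mathcal{B}} \subseteq \Delta^{\mathcal{B}} \times \bigcup_{D \in \mathcal{D}_S} D^{\mathcal{B}_D}$, and

• every relationship $R \in \mathcal{R}_S$ to a set $R^{\mathcal{B}}$ of $\mathcal{U}_S$-labeled tuples over $\Delta^{\mathcal{B}}$.

The elements of $E^{\mathcal{B}}$, $A^{\mathcal{B}}$, and $R^{\mathcal{B}}$ are called instances of $E$, $A$, and $R$ respectively.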

A database state is considered acceptable if it satisfies all integrity constraints that are 
part of the schema. This is captured by the definition of legal database state. 



Definition 4.3 A database state $\mathcal{B}$ is said to be legal for an ER schema $S =
(\mathcal{L}_S, \preceq_S, att_S, rel_S, card_S)$, if it satisfies the following conditions:

• For each pair of entities $E_1, E_2 \in \mathcal{E}_S$ such that $E_1 \preceq_S E_2$, it holds that $E_1^{\mathcal{B}} \subseteq E_2^{\mathcal{B}}$.

• For each entity $E \in \mathcal{E}_S$, if $att_S(E) = [A_1\colon D_1, \ldots, A_h\colon D_h]$, then for each instance
$e \in E^{\mathcal{B}}$ and for each $i \in \{1, \ldots, h\}$ the following holds:

- there is exactly one element $a_i \in A_i^{\mathcal{B}}$ whose first component is $e$, and

- the second component of $a_i$ is an element of $D_i^{\mathcal{B}_D}$.

• For each relationship $R \in \mathcal{R}_S$ such that $rel_S(R) = [U_1\colon E_1, \ldots, U_k\colon E_k]$, all instances
of $R$ are of the form $[U_1\colon e_1, \ldots, U_k\colon e_k]$, where $e_i \in E_i^{\mathcal{B}}$, $i \in \{1, \ldots, k\}$.

• For each relationship $R \in \mathcal{R}_S$ such that $rel_S(R) = [U_1\colon E_1, \ldots, U_k\colon E_k]$, for each
$i \in \{1, \ldots, k\}$, for each entity $E \in \mathcal{E}_S$ such that $E \preceq^*_S E_i$ and for each instance $e$ of $E$
in $\mathcal{B}$, it holds that

$cmin_S(E, R, U_i) \leq \sharp\{r \in R^{\mathcal{B}} \mid r[U_i] = e\} \leq cmax_S(E, R, U_i)$.

Figure 5: The ER schema corresponding to Example 2.2 (entities Number and Even)



Notice that the definition of database state reflects the usual assumption in databases
that database states are finite structures (see also Cosmadakis, Kanellakis, & Vardi, 1990).
In fact, the basic domains are not required to be finite, but for each legal database state
for a schema, only a finite set of values from the basic domains are actually of interest. We
define the active domain $\Delta^{\mathcal{B}}_{act}$ of a database state $\mathcal{B}$ as the set of all elements of the basic
domains $D^{\mathcal{B}_D}$, $D \in \mathcal{D}_S$, that effectively appear as values of attributes in $\mathcal{B}$. More formally:

$\Delta^{\mathcal{B}}_{act} = \{ d \in D^{\mathcal{B}_D} \mid D \in \mathcal{D}_S \wedge \exists A \in \mathcal{A}_S, e \in \Delta^{\mathcal{B}}.\ (e, d) \in A^{\mathcal{B}} \}$.

Since $\Delta^{\mathcal{B}}$ is finite and $\mathcal{A}_S$ contains only a finite number of attributes, which are functional
by definition, also $\Delta^{\mathcal{B}}_{act}$ is finite.
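
The four conditions of a legal database state, together with the active domain, can be checked mechanically on small examples. The sketch below is only an illustration under stated assumptions: the names `ERSchema`, `DBState`, `is_legal`, and `active_domain` are hypothetical, labeled tuples are represented as Python dictionaries, basic domains are approximated by Python types, and the example schema and its cardinalities are invented.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple

INF = math.inf  # stands for the special value "infinity" of cmax


@dataclass
class ERSchema:
    entities: Set[str]
    isa: Set[Tuple[str, str]]        # pairs (E1, E2) with E1 ISA E2
    att: Dict[str, Dict[str, str]]   # entity -> {attribute: domain symbol}
    rel: Dict[str, Dict[str, str]]   # relationship -> {role: primary entity}
    card: Dict[Tuple[str, str, str], Tuple[int, float]] = field(default_factory=dict)

    def isa_closure(self) -> Set[Tuple[str, str]]:
        """Reflexive transitive closure of the ISA relation."""
        closure = {(e, e) for e in self.entities} | set(self.isa)
        changed = True
        while changed:
            changed = False
            for a, b in list(closure):
                for c, d in list(closure):
                    if b == c and (a, d) not in closure:
                        closure.add((a, d))
                        changed = True
        return closure


@dataclass
class DBState:
    ent: Dict[str, Set[str]]                   # extension of each entity
    attr: Dict[str, Set[Tuple[str, object]]]   # attribute -> pairs (object, value)
    rels: Dict[str, List[Dict[str, str]]]      # relationship -> labeled tuples
    domains: Dict[str, type]                   # domain symbol -> Python type (assumption)


def is_legal(s: ERSchema, b: DBState) -> bool:
    # Condition 1: every ISA assertion holds as a set inclusion.
    if any(not b.ent[e1] <= b.ent[e2] for e1, e2 in s.isa):
        return False
    # Condition 2: attributes are mandatory, single-valued and well typed.
    for e, attrs in s.att.items():
        for a, dom in attrs.items():
            for obj in b.ent[e]:
                values = [v for (o, v) in b.attr.get(a, set()) if o == obj]
                if len(values) != 1 or not isinstance(values[0], b.domains[dom]):
                    return False
    # Condition 3: relationship instances are tuples over the primary entities.
    for r, roles in s.rel.items():
        for t in b.rels.get(r, []):
            if set(t) != set(roles) or any(t[u] not in b.ent[roles[u]] for u in roles):
                return False
    # Condition 4: cardinality constraints for sub-entities of primary entities.
    closure = s.isa_closure()
    for r, roles in s.rel.items():
        for u, primary in roles.items():
            for e in s.entities:
                if (e, primary) not in closure:
                    continue
                lo, hi = s.card.get((e, r, u), (0, INF))
                for obj in b.ent[e]:
                    n = sum(1 for t in b.rels.get(r, []) if t[u] == obj)
                    if not lo <= n <= hi:
                        return False
    return True


def active_domain(b: DBState) -> Set[object]:
    """Values that actually occur as attribute values in the state."""
    return {v for pairs in b.attr.values() for (_, v) in pairs}


# Hypothetical schema and state, for illustration only.
schema = ERSchema(
    entities={"Course", "Student"},
    isa=set(),
    att={"Student": {"name": "String"}},
    rel={"ENROLLING": {"Ein": "Course", "Eof": "Student"}},
    card={("Course", "ENROLLING", "Ein"): (1, 2)},
)
state = DBState(
    ent={"Course": {"c1"}, "Student": {"s1", "s2"}},
    attr={"name": {("s1", "Ann"), ("s2", "Bob")}},
    rels={"ENROLLING": [{"Ein": "c1", "Eof": "s1"}, {"Ein": "c1", "Eof": "s2"}]},
    domains={"String": str},
)
```

With this state, `is_legal(schema, state)` holds; adding a third enrollment for `c1` would violate the assumed maximum cardinality of 2.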

Reasoning in the ER model includes verifying entity satisfiability and deducing
inheritance. Entity satisfiability amounts to checking whether a given entity can be populated in
some legal database state (Atzeni & Parker Jr., 1986; Lenzerini & Nobili, 1990; Di Battista
& Lenzerini, 1993), and corresponds to the notion of concept consistency in DLs. Deducing
inheritance amounts to verifying whether in all databases that are legal for the schema,
every instance of an entity is also an instance of another entity. Such implied ISA
relationships can arise for different reasons. Either trivially, through the transitive closure of the
explicit ISA relationships present in the schema, or in more subtle ways, through specific
patterns of cardinality restrictions along cycles in the schema and the requirement of the
database state to be finite (Lenzerini & Nobili, 1990; Cosmadakis et al., 1990).

Example 4.4 Figure 5 shows an ER schema modeling the same situation as the knowledge 
base of Example 2.2. Arguing exactly as in that example we can conclude that the two 
entities Number and Even denote the same set of elements in every finite database legal for 
the schema, although the ISA relation from Number to Even is not stated explicitly. It is 
implied, however, due to the cycle involving the relationship and the two entities and due 
to the particular form of cardinality constraints. ■ 

4.3 Relationship between Entity-Relationship Schemata and ALUNI

We now show that the different forms of reasoning on ER schemata are captured by finite 
concept consistency and finite concept subsumption in ALUNI. The correspondence between 
the two formalisms is established by defining a translation $\phi$ from ER schemata to ALUNI
knowledge bases, and then proving that there is a correspondence between legal database 
states and finite models of the derived knowledge base. 

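Definition 4.5 Let $S = (\mathcal{L}_S, \preceq_S, att_S, rel_S, card_S)$ be an ER schema. The ALUNI
knowledge base $\phi(S) = (\mathcal{A}, \mathcal{P}, \mathcal{T})$ is defined as follows:

The set $\mathcal{A}$ of atomic concepts of $\phi(S)$ contains the following elements:

• for each domain symbol $D \in \mathcal{D}_S$, an atomic concept $\phi(D)$;

• for each entity $E \in \mathcal{E}_S$, an atomic concept $\phi(E)$;

• for each relationship $R \in \mathcal{R}_S$, an atomic concept $\phi(R)$.

The set $\mathcal{P}$ of atomic roles of $\phi(S)$ contains the following elements:

• for each attribute $A \in \mathcal{A}_S$, an atomic role $\phi(A)$;

• for each relationship $R \in \mathcal{R}_S$ such that $rel_S(R) = [U_1\colon E_1, \ldots, U_k\colon E_k]$, $k$ atomic roles
$\phi(U_1), \ldots, \phi(U_k)$.

The set $\mathcal{T}$ of assertions of $\phi(S)$ contains the following elements:

• for each pair of entities $E_1, E_2 \in \mathcal{E}_S$ such that $E_1 \preceq_S E_2$, the assertion

$\phi(E_1) \preceq \phi(E_2)$    (1)

• for each entity $E \in \mathcal{E}_S$ such that $att_S(E) = [A_1\colon D_1, \ldots, A_h\colon D_h]$, the assertion

$\phi(E) \preceq \forall\phi(A_1).\phi(D_1) \sqcap \cdots \sqcap \forall\phi(A_h).\phi(D_h) \sqcap \exists^{=1}\phi(A_1) \sqcap \cdots \sqcap \exists^{=1}\phi(A_h)$    (2)

• for each relationship $R \in \mathcal{R}_S$ such that $rel_S(R) = [U_1\colon E_1, \ldots, U_k\colon E_k]$, the assertions

$\phi(R) \preceq \forall\phi(U_1).\phi(E_1) \sqcap \cdots \sqcap \forall\phi(U_k).\phi(E_k) \sqcap \exists^{=1}\phi(U_1) \sqcap \cdots \sqcap \exists^{=1}\phi(U_k)$    (3)
$\phi(E_i) \preceq \forall(\phi(U_i))^{-}.\phi(R)$, $i \in \{1, \ldots, k\}$    (4)

• for each relationship $R \in \mathcal{R}_S$ such that $rel_S(R) = [U_1\colon E_1, \ldots, U_k\colon E_k]$, for $i \in
\{1, \ldots, k\}$, and for each entity $E \in \mathcal{E}_S$ such that $E \preceq^*_S E_i$,

- if $m = cmin_S(E, R, U_i) \neq 0$, the assertion

$\phi(E) \preceq \exists^{\geq m}(\phi(U_i))^{-}$    (5)

- if $n = cmax_S(E, R, U_i) \neq \infty$, the assertion

$\phi(E) \preceq \exists^{\leq n}(\phi(U_i))^{-}$    (6)

• for each pair of symbols $X_1, X_2 \in \mathcal{E}_S \cup \mathcal{R}_S \cup \mathcal{D}_S$ such that $X_1 \neq X_2$ and $X_1 \in \mathcal{R}_S \cup \mathcal{D}_S$,
the assertion

$\phi(X_1) \preceq \neg\phi(X_2)$.    (7)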
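
$\mathcal{K} = (\mathcal{A}, \mathcal{P}, \mathcal{T})$, where

$\mathcal{A}$ = {Course, AdvCourse, Teacher, Student, GradStudent, TEACHING, ENROLLING, String},
$\mathcal{P}$ = {Tof, Tby, Ein, Eof, degree},
and the set $\mathcal{T}$ of assertions consists of:

TEACHING $\preceq$ $\forall$Tof.Course $\sqcap$ $\exists^{=1}$Tof $\sqcap$ $\forall$Tby.Teacher $\sqcap$ $\exists^{=1}$Tby
ENROLLING $\preceq$ $\forall$Ein.Course $\sqcap$ $\exists^{=1}$Ein $\sqcap$ $\forall$Eof.Student $\sqcap$ $\exists^{=1}$Eof
Course $\preceq$ $\forall$Tof$^{-}$.TEACHING $\sqcap$ $\exists^{=1}$Tof$^{-}$ $\sqcap$
    $\forall$Ein$^{-}$.ENROLLING $\sqcap$ $\exists^{\geq 2}$Ein$^{-}$ $\sqcap$ $\exists^{\leq 30}$Ein$^{-}$
AdvCourse $\preceq$ Course $\sqcap$ $\exists^{\leq 20}$Ein$^{-}$
Teacher $\preceq$ $\forall$Tby$^{-}$.TEACHING
Student $\preceq$ $\forall$Eof$^{-}$.ENROLLING $\sqcap$ $\exists^{\geq 4}$Eof$^{-}$ $\sqcap$ $\exists^{\leq 6}$Eof$^{-}$
GradStudent $\preceq$ Student $\sqcap$ $\forall$degree.String $\sqcap$ $\exists^{=1}$degree

Figure 6: The ALUNI knowledge base corresponding to the ER schema in Figure 4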



Example 4.2 (cont.) We illustrate the translation on the ER schema of Figure 4. The 
ALUNI knowledge base that captures exactly its semantics is shown in Figure 6, where for 
brevity the disjointness assertions (7) are omitted, and assertions with the same concept on 
the left hand side are collapsed. ■ 
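
To make the translation of Definition 4.5 concrete, the following sketch generates assertions (1) to (7) as text for a small schema. It is an illustration only: the function names `isa_closure` and `phi`, the dictionary-based encoding of $att_S$, $rel_S$ and $card_S$, the textual rendering of concept expressions, and the example schema with its cardinalities are all assumptions and are not taken from Figure 4.

```python
import math
from typing import Dict, List, Set, Tuple

INF = math.inf  # stands for the special value "infinity" of cmax


def isa_closure(entities: Set[str], isa: Set[Tuple[str, str]]) -> Set[Tuple[str, str]]:
    """Reflexive transitive closure of the ISA relation."""
    closure = {(e, e) for e in entities} | set(isa)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


def phi(entities: Set[str], domains: Set[str], isa: Set[Tuple[str, str]],
        att: Dict[str, Dict[str, str]], rel: Dict[str, Dict[str, str]],
        card: Dict[Tuple[str, str, str], Tuple[int, float]]) -> List[str]:
    out: List[str] = []
    # Assertions (1): explicit ISA links.
    for e1, e2 in sorted(isa):
        out.append(f"{e1} ≼ {e2}")
    # Assertions (2): typed, mandatory, single-valued attributes.
    for e in sorted(att):
        if att[e]:
            parts = [f"∀{a}.{d}" for a, d in att[e].items()]
            parts += [f"∃=1 {a}" for a in att[e]]
            out.append(f"{e} ≼ " + " ⊓ ".join(parts))
    # Assertions (3) and (4): relationships are reified as concepts.
    for r in sorted(rel):
        parts = [f"∀{u}.{e}" for u, e in rel[r].items()]
        parts += [f"∃=1 {u}" for u in rel[r]]
        out.append(f"{r} ≼ " + " ⊓ ".join(parts))
        for u, e in rel[r].items():
            out.append(f"{e} ≼ ∀{u}⁻.{r}")
    # Assertions (5) and (6): cardinalities become number restrictions on inverse roles.
    closure = isa_closure(entities, isa)
    for r in sorted(rel):
        for u, primary in rel[r].items():
            for e in sorted(entities):
                if (e, primary) in closure:
                    lo, hi = card.get((e, r, u), (0, INF))
                    if lo != 0:
                        out.append(f"{e} ≼ ∃≥{lo} {u}⁻")
                    if hi != INF:
                        out.append(f"{e} ≼ ∃≤{hi} {u}⁻")
    # Assertions (7): relationships and domains are disjoint from everything else.
    symbols = sorted(entities) + sorted(rel) + sorted(domains)
    for x1 in sorted(rel) + sorted(domains):
        for x2 in symbols:
            if x1 != x2:
                out.append(f"{x1} ≼ ¬{x2}")
    return out


# Hypothetical schema, for illustration only.
kb = phi(
    entities={"Course", "AdvCourse", "Student"},
    domains={"String"},
    isa={("AdvCourse", "Course")},
    att={"Student": {"name": "String"}},
    rel={"ENROLLING": {"Ein": "Course", "Eof": "Student"}},
    card={("Course", "ENROLLING", "Ein"): (1, 3), ("AdvCourse", "ENROLLING", "Ein"): (1, 2)},
)
for assertion in kb:
    print(assertion)
```

Note that the sketch emits one assertion per constraint, whereas Figure 6 collapses assertions with the same concept on the left hand side.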

The translation makes use of both inverse attributes and number restrictions to capture 
the semantics of ER schemata. We observe that, by means of the inverse constructor, a 
binary relationship could be treated in a simpler way by choosing a traversal direction and 
mapping the relationship directly to a role. Notice also that the assumption of acyclicity 
of the resulting knowledge base is unrealistic in this case, and in order to exploit the cor- 
respondence for reasoning in the ER model, we need techniques that can deal with inverse 
attributes, number restrictions, and cycles together. As shown in Example 2.2, the com- 
bination of these factors causes the finite model property to fail to hold, and we need to 
resort to reasoning methods for finite models. 

In fact, we can reduce reasoning in the ER model to finite model reasoning in ALUNI 
knowledge bases. For this purpose we define a mapping between database states corre- 
sponding to an ER schema and finite interpretations of the knowledge base derived from it. 
Due to the possible presence of relations with arity greater than 2, this mapping is however 
not one-to-one and we first need to characterize those interpretations of the knowledge base 
that directly correspond to database states. 

Definition 4.6 Let $S = (\mathcal{L}_S, \preceq_S, att_S, rel_S, card_S)$ be an ER schema and $\phi(S)$ be defined
as above. An interpretation $\mathcal{I}$ of $\phi(S)$ is relation-descriptive, if for every relationship $R \in
\mathcal{R}_S$ with $rel_S(R) = [U_1\colon E_1, \ldots, U_k\colon E_k]$, for every $d, d' \in (\phi(R))^{\mathcal{I}}$ we have that

$\Big(\bigwedge_{1 \leq i \leq k} \forall d'' \in \Delta^{\mathcal{I}}.\ \big((d, d'') \in (\phi(U_i))^{\mathcal{I}} \leftrightarrow (d', d'') \in (\phi(U_i))^{\mathcal{I}}\big)\Big) \rightarrow d = d'$.    (8)

Intuitively, the extension of a relationship in a database state is a set of labeled tuples,
and such a set does not contain the same element twice. Therefore it is implicit in the 
semantics of the ER model that there cannot be two labeled tuples connected through all 
roles of the relationship to exactly the same elements of the domain. In a model of the 
ALUNI knowledge base corresponding to the ER schema, on the other hand, each tuple is
represented by a new individual, and the above condition is not implicit anymore. It also
cannot be imposed in ALUNI by suitable assertions. The following lemma, however, shows
that we do not need such an explicit condition, when we are interested in reasoning on an 
ALUNI knowledge base corresponding to an ER schema. This is due to the fact that we can 
always restrict ourselves to considering only relation-descriptive models. 
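
The condition of relation-descriptiveness is easy to test on a finite interpretation. The sketch below is an illustration under simplifying assumptions: it considers a single relationship, it assumes that every role is functional on the individuals representing tuples (as assertions (3) require in a model), so that condition (8) reduces to comparing tuples of role fillers, and the function names and example individuals are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Maps each individual representing a tuple of a relationship R to its
# role fillers (U_1(r), ..., U_k(r)).
Fillers = Dict[str, Tuple[str, ...]]


def conflict_sets(fillers: Fillers) -> List[List[str]]:
    """Groups of distinct individuals that agree on all role fillers."""
    groups = defaultdict(list)
    for r, tup in fillers.items():
        groups[tup].append(r)
    return [sorted(g) for g in groups.values() if len(g) > 1]


def is_relation_descriptive(fillers: Fillers) -> bool:
    """True if no two individuals stand for the same labeled tuple."""
    return not conflict_sets(fillers)


# Hypothetical individuals of a binary relationship.
enrolling = {
    "r1": ("c1", "s1"),
    "r2": ("c1", "s1"),  # same fillers as r1: not a set of labeled tuples
    "r3": ("c1", "s2"),
}
print(conflict_sets(enrolling))
```

Here `r1` and `r2` would collapse into the same labeled tuple of a database state, which is exactly the situation that relation-descriptive interpretations rule out.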

Lemma 4.7 Let $S$ be an ER schema, $\phi(S)$ be the ALUNI knowledge base obtained from $S$
according to Definition 4.5, and $C$ be a concept expression of $\phi(S)$. If $C$ is finitely consistent
in $\phi(S)$, then there is a finite relation-descriptive model $\mathcal{I}$ of $\phi(S)$ such that $C^{\mathcal{I}} \neq \emptyset$.

Proof. Let $\mathcal{I}_0$ be a finite model of $\phi(S)$ such that $C^{\mathcal{I}_0} \neq \emptyset$. We can build a finite
relation-descriptive model $\mathcal{I}'$ by starting from $\mathcal{I}_0$ and applying the following construction once for
each relationship in $\mathcal{R}_S$.

Let $\mathcal{I}$ be the model obtained in the previous step and let $R \in \mathcal{R}_S$ with $rel_S(R) =
[U_1\colon E_1, \ldots, U_k\colon E_k]$ be the next relationship to which we apply the construction. We
construct from $\mathcal{I}$ a model $\mathcal{I}_R$ such that condition 8 is satisfied for relationship $R$.

Given an individual $r \in (\phi(R))^{\mathcal{I}}$, we denote by $U_i(r)$, $i \in \{1, \ldots, k\}$, the (unique)
individual $e$ such that $(r, e) \in (\phi(U_i))^{\mathcal{I}}$. For $e_i \in (\phi(E_i))^{\mathcal{I}}$, $i \in \{1, \ldots, k\}$, we define
$X_{(U_1:e_1,\ldots,U_k:e_k)} = \{r \in (\phi(R))^{\mathcal{I}} \mid U_i(r) = e_i, \text{ for } i \in \{1, \ldots, k\}\}$. We call conflict-set
a set $X_{(U_1:e_1,\ldots,U_k:e_k)}$ with more than one element. From each conflict-set $X_{(U_1:e_1,\ldots,U_k:e_k)}$
we randomly choose one individual $r$, and we say that the others induce a conflict on
$(U_1\colon e_1, \ldots, U_k\colon e_k)$. We call $Conf$ the (finite) set of all objects inducing a conflict on some
$(U_1\colon e_1, \ldots, U_k\colon e_k)$.

We define an interpretation $\mathcal{I}_{2^{Conf}}$ as the disjoint union of $2^{\sharp Conf}$ copies of $\mathcal{I}$, one copy,
denoted by $\mathcal{I}_Z$, for every set $Z \in 2^{Conf}$. We denote by $d_Z$ the copy in $\mathcal{I}_Z$ of the individual
$d$ in $\mathcal{I}$. Since the disjoint union of two models of an ALUNI knowledge base is again a
model, $\mathcal{I}_{2^{Conf}}$ is a model of $\phi(S)$. Let $\mathcal{I}_Z$ and $\mathcal{I}_{Z'}$ be two copies of $\mathcal{I}$ in $\mathcal{I}_{2^{Conf}}$. We call
exchanging $U_k(r_Z)$ with $U_k(r_{Z'})$ the operation on $\mathcal{I}_{2^{Conf}}$ consisting of replacing in $(\phi(U_k))^{\mathcal{I}_Z}$
the pair $(r_Z, U_k(r_Z))$ with $(r_Z, U_k(r_{Z'}))$ and, at the same time, replacing in $(\phi(U_k))^{\mathcal{I}_{Z'}}$ the
pair $(r_{Z'}, U_k(r_{Z'}))$ with $(r_{Z'}, U_k(r_Z))$. Intuitively, by exchanging $U_k(r_Z)$ with $U_k(r_{Z'})$, the
individuals $r_Z$ and $r_{Z'}$ do not induce conflicts anymore.

We construct now from $\mathcal{I}_{2^{Conf}}$ an interpretation $\mathcal{I}_R$ as follows: For each $r \in Conf$ and
for each $Z \in 2^{Conf}$ such that $r \in Z$, we exchange $U_k(r_Z)$ with $U_k(r_{Z \setminus \{r\}})$. It is possible
to show that all conflicts are thus eliminated while no new conflict is created. Hence, in
$\mathcal{I}_R$, condition 8 for $R$ is satisfied. We still have to show that $\mathcal{I}_R$ is a model of $\phi(S)$ in
which $C^{\mathcal{I}_R} \neq \emptyset$. Indeed, it is straightforward to check by induction that for every concept
expression $C'$ appearing in $\phi(S)$, for all $Z \in 2^{Conf}$, $d \in C'^{\mathcal{I}}$ if and only if $d_Z \in C'^{\mathcal{I}_R}$. Thus
all assertions of $\phi(S)$ are still satisfied in $\mathcal{I}_R$ and $C^{\mathcal{I}_R} \neq \emptyset$. □
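
The copy-and-exchange construction used in the proof can be tried out on a toy example. The sketch below is a simplified illustration and not the authors' procedure verbatim: it handles a single relationship of arity at least 2, represents the copy $d_Z$ of an individual $d$ as the pair `(d, Z)`, keeps the first individual of each conflict-set instead of choosing one at random, and uses hypothetical names throughout.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, FrozenSet, Hashable, List, Tuple

Copy = Tuple[str, FrozenSet[str]]  # the copy d_Z of individual d in copy Z


def conflicts(fillers: Dict[Hashable, tuple]) -> List[list]:
    """Groups of individuals that agree on all role fillers."""
    groups = defaultdict(list)
    for r, tup in fillers.items():
        groups[tup].append(r)
    return [g for g in groups.values() if len(g) > 1]


def eliminate_conflicts(fillers: Dict[str, Tuple[str, ...]]) -> Dict[Copy, Tuple[Copy, ...]]:
    """Role fillers of the tuple individuals after copying and exchanging."""
    # The exchange acts on the last role only, so arity >= 2 is assumed here.
    assert all(len(t) >= 2 for t in fillers.values())
    # Conf: all but one individual of every conflict-set.
    conf = sorted(r for g in conflicts(fillers) for r in g[1:])
    subsets = [frozenset(c) for n in range(len(conf) + 1) for c in combinations(conf, n)]
    result: Dict[Copy, Tuple[Copy, ...]] = {}
    for z in subsets:
        for r, tup in fillers.items():
            # The copy r_Z keeps its first k-1 fillers inside copy Z.
            copied = [(e, z) for e in tup[:-1]]
            # If r induces a conflict, its last filler is exchanged between
            # the copies Z and Z with r added or removed.
            last_copy = z ^ {r} if r in conf else z
            copied.append((tup[-1], last_copy))
            result[(r, z)] = tuple(copied)
    return result


# Hypothetical individuals: r1, r2, r3 all stand for the tuple (c1, s1).
original = {
    "r1": ("c1", "s1"),
    "r2": ("c1", "s1"),
    "r3": ("c1", "s1"),
    "r4": ("c1", "s2"),
}
exchanged = eliminate_conflicts(original)
print(len(exchanged), conflicts(exchanged))
```

In this example two individuals induce conflicts, so four copies are created. After the exchange no two individuals share all fillers, and every filler keeps the same number of incoming role links as in the original interpretation, which is why number restrictions on inverse roles remain satisfied.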

With this result, the following correspondence between legal database states for an
ER schema and relation-descriptive models of the resulting ALUNI knowledge base can be
established.

Proposition 4.8 For every ER schema $S = (\mathcal{L}_S, \preceq_S, att_S, rel_S, card_S)$ there exist two
mappings $\alpha_S$, from database states corresponding to $S$ to finite interpretations of its
translation $\phi(S)$, and $\beta_S$, from finite relation-descriptive interpretations of $\phi(S)$ to database states
corresponding to $S$, such that:

1. For each legal database state $\mathcal{B}$ for $S$, $\alpha_S(\mathcal{B})$ is a finite model of $\phi(S)$, and for each
symbol $X \in \mathcal{E}_S \cup \mathcal{A}_S \cup \mathcal{R}_S \cup \mathcal{D}_S$, $X^{\mathcal{B}} = (\phi(X))^{\alpha_S(\mathcal{B})}$.

2. For each finite relation-descriptive model $\mathcal{I}$ of $\phi(S)$, $\beta_S(\mathcal{I})$ is a legal database state for
$S$, for each entity $E \in \mathcal{E}_S$, $(\phi(E))^{\mathcal{I}} = E^{\beta_S(\mathcal{I})}$, and for each symbol $X \in \mathcal{A}_S \cup \mathcal{R}_S \cup \mathcal{D}_S$,

Proof. (1) Given a database state $\mathcal{B}$, we define the interpretation $\mathcal{I} = \alpha_S(\mathcal{B})$ of $\phi(S)$ as
follows:

• For each symbol $X \in \mathcal{E}_S \cup \mathcal{A}_S \cup \mathcal{R}_S \cup \mathcal{D}_S$,

$(\phi(X))^{\mathcal{I}} = X^{\mathcal{B}}$    (9)

• For each relationship $R \in \mathcal{R}_S$ such that $rel_S(R) = [U_1\colon E_1, \ldots, U_k\colon E_k]$,

$(\phi(U_i))^{\mathcal{I}} = \{(r, e) \in \Delta^{\mathcal{I}} \times \Delta^{\mathcal{I}} \mid r \in R^{\mathcal{B}} \text{ and } r[U_i] = e\}$, $i \in \{1, \ldots, k\}$.    (10)

Let $\mathcal{B}$ be a legal database state. To prove claim (1) it is sufficient to show that $\mathcal{I}$ satisfies
every assertion in $\phi(S)$. Assertions 1 are satisfied since $\mathcal{B}$ satisfies the set inclusion between
the extensions of the corresponding entities. With respect to assertions 2, let $E \in \mathcal{E}_S$ be an
entity such that $att_S(E) = [A_1\colon D_1, \ldots, A_h\colon D_h]$, and consider an instance $e \in (\phi(E))^{\mathcal{I}}$. We
have to show that for each $i \in \{1, \ldots, h\}$, there is exactly one element $e_i \in \Delta^{\mathcal{I}}$ such that
$(e, e_i) \in (\phi(A_i))^{\mathcal{I}}$, and moreover that $e_i \in (\phi(D_i))^{\mathcal{I}}$. By 9, $e \in E^{\mathcal{B}}$, and by definition of legal
database state there is exactly one element $a_i \in A_i^{\mathcal{B}} = (\phi(A_i))^{\mathcal{I}}$ whose first component is $e$.
Moreover, the second component of $a_i$ is an element of $D_i^{\mathcal{B}_D} = (\phi(D_i))^{\mathcal{I}}$. With respect
to assertions 3, let $R \in \mathcal{R}_S$ be a relationship such that $rel_S(R) = [U_1\colon E_1, \ldots, U_k\colon E_k]$,
and consider an instance $r \in (\phi(R))^{\mathcal{I}}$. We have to show that for each $i \in \{1, \ldots, k\}$
there is exactly one element $e_i \in \Delta^{\mathcal{I}}$ such that $(r, e_i) \in (\phi(U_i))^{\mathcal{I}}$, and that moreover
$e_i \in (\phi(E_i))^{\mathcal{I}}$. By 9, $r \in R^{\mathcal{B}}$, and by definition of legal database state, $r$ is a labeled tuple
of the form $[U_1\colon e'_1, \ldots, U_k\colon e'_k]$, where $e'_i \in E_i^{\mathcal{B}}$, $i \in \{1, \ldots, k\}$. Therefore $r$ is a function
defined on $\{U_1, \ldots, U_k\}$, and by 10, $e_i$ is unique and equal to $e'_i$. Moreover, again by 9,
$e_i \in (\phi(E_i))^{\mathcal{I}} = E_i^{\mathcal{B}}$. Assertions 4 are satisfied, since by 10 the first component of each
element of $(\phi(U_i))^{\mathcal{I}}$ is always an element of $R^{\mathcal{B}} = (\phi(R))^{\mathcal{I}}$. With respect to assertions 5,
let $R \in \mathcal{R}_S$ be a relationship such that $rel_S(R) = [U_1\colon E_1, \ldots, U_k\colon E_k]$, let $E \in \mathcal{E}_S$ be an
entity such that $E \preceq^*_S E_i$, for some $i \in \{1, \ldots, k\}$, and such that $m = cmin_S(E, R, U_i) \neq 0$.
Consider an instance $e \in (\phi(E))^{\mathcal{I}}$. We have to show that there are at least $m$ pairs in
$(\phi(U_i))^{\mathcal{I}}$ that have $e$ as their second component. Since assertions 4 are satisfied we know
that the first component of all such pairs is an instance of $\phi(R)$. By 9 and by definition
of legal database state, there are at least $m$ labeled tuples in $R^{\mathcal{B}}$ whose $U_i$ component is
equal to $e$. By 10, $(\phi(U_i))^{\mathcal{I}}$ contains at least $m$ pairs whose second component is equal to
$e$. With respect to assertions 6 we can proceed in a similar way. Finally, assertions 7 are
satisfied since first, by definition the basic domains are pairwise disjoint and disjoint from
$\Delta^{\mathcal{B}}$ and from the set of labeled tuples, second, no element of $\Delta^{\mathcal{B}}$ is a labeled tuple, and
third, labeled tuples corresponding to different relationships cannot be equal since they are
defined over different sets of roles.

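(2) Let $\mathcal{I}$ be a finite relation-descriptive interpretation of $\phi(S)$. For each basic domain
$D \in \mathcal{D}_S$, let $\beta_D$ be a function from $\Delta^{\mathcal{I}}$ to $D^{\mathcal{B}_D}$ that is one-to-one and onto. Since $\Delta^{\mathcal{I}}$
is finite and each basic domain contains a countable number of elements, such a function
always exists. In order to define $\beta_S(\mathcal{I})$ we first specify a mapping $\beta_\Delta$ that associates to
each individual $d \in \Delta^{\mathcal{I}}$ an element as follows:

• If $d \in (\phi(E))^{\mathcal{I}}$ for some entity $E \in \mathcal{E}_S$, then $\beta_\Delta(d) = d$.

• If $d \in (\phi(R))^{\mathcal{I}}$ for some relationship $R \in \mathcal{R}_S$ with $rel_S(R) = [U_1\colon E_1, \ldots, U_k\colon E_k]$, and
there are individuals $d_1, \ldots, d_k \in \Delta^{\mathcal{I}}$ such that $(d, d_i) \in (\phi(U_i))^{\mathcal{I}}$, for $i \in \{1, \ldots, k\}$,
then $\beta_\Delta(d) = [U_1\colon d_1, \ldots, U_k\colon d_k]$.

• If $d \in (\phi(D))^{\mathcal{I}}$ for some basic domain $D \in \mathcal{D}_S$, then $\beta_\Delta(d) = \beta_D(d)$.

• Otherwise $\beta_\Delta(d) = d$.

For a pair of individuals $(d_1, d_2) \in \Delta^{\mathcal{I}} \times \Delta^{\mathcal{I}}$, $\beta_\Delta((d_1, d_2)) = (\beta_\Delta(d_1), \beta_\Delta(d_2))$, and for a set
$X$, $\beta_\Delta(X) = \{\beta_\Delta(x) \mid x \in X\}$.

If $\mathcal{I}$ is a model of $\phi(S)$ the above rules define $\beta_\Delta(d)$ for every $d \in \Delta^{\mathcal{I}}$. Indeed, by
assertions 7, each $d \in \Delta^{\mathcal{I}}$ can be an instance of at most one atomic concept corresponding
to a relationship or basic domain, and if this is the case it is not an instance of any atomic
concept corresponding to an entity. Moreover, if $d \in (\phi(R))^{\mathcal{I}}$ for some relationship $R \in \mathcal{R}_S$
with $rel_S(R) = [U_1\colon E_1, \ldots, U_k\colon E_k]$, then by assertions 3, for each $i \in \{1, \ldots, k\}$ there is
exactly one element $d_i \in \Delta^{\mathcal{I}}$ such that $(d, d_i) \in (\phi(U_i))^{\mathcal{I}}$. If $\mathcal{I}$ is not a model of $\phi(S)$ and
for some $d \in \Delta^{\mathcal{I}}$, $\beta_\Delta(d)$ is not uniquely determined, then we choose nondeterministically
one possible value.

We can now define the database state $\mathcal{B} = \beta_S(\mathcal{I})$ corresponding to $\mathcal{I}$:

• $\Delta^{\mathcal{B}} = \Delta^{\mathcal{I}} \setminus \big(\bigcup_{R \in \mathcal{R}_S} (\phi(R))^{\mathcal{I}} \cup \bigcup_{D \in \mathcal{D}_S} (\phi(D))^{\mathcal{I}}\big)$.

• For each symbol $X \in \mathcal{E}_S \cup \mathcal{A}_S \cup \mathcal{R}_S \cup \mathcal{D}_S$, $X^{\mathcal{B}} = \beta_\Delta((\phi(X))^{\mathcal{I}})$.

It is not difficult to see that if $\mathcal{I}$ is a model of $\phi(S)$, then $\mathcal{B}$ defined in such a way is a legal
database state for $S$ with active domain $\bigcup_{D \in \mathcal{D}_S}$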

The following theorem allows us to reduce reasoning on ER schemata to finite model
reasoning on ALUNI knowledge bases.

Theorem 4.9 Let $S$ be an ER schema, $E$, $E'$ be two entities in $S$, and $\phi(S)$ be the
translation of $S$. Then the following holds:

1. $E$ is satisfiable in $S$ if and only if $\phi(S) \not\models_f \phi(E) \preceq \bot$.

2. $E$ inherits from $E'$ in $S$ if and only if $\phi(S) \models_f \phi(E) \preceq \phi(E')$.

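Proof. (1) "$\Rightarrow$" Let $\mathcal{B}$ be a legal database state with $E^{\mathcal{B}} \neq \emptyset$. By part 1 of Proposition 4.8,
$\alpha_S(\mathcal{B})$ is a finite model of $\phi(S)$ in which $(\phi(E))^{\alpha_S(\mathcal{B})} \neq \emptyset$.

"$\Leftarrow$" Let $\phi(E)$ be finitely consistent in $\phi(S)$. By Lemma 4.7 there is a finite
relation-descriptive model $\mathcal{I}$ of $\phi(S)$ with $(\phi(E))^{\mathcal{I}} \neq \emptyset$. By part 2 of Proposition 4.8, $\beta_S(\mathcal{I})$ is a
database state legal for $S$ in which $E^{\beta_S(\mathcal{I})} \neq \emptyset$.

(2) "$\Rightarrow$" Let $\phi(S) \not\models_f \phi(E) \preceq \phi(E')$. Then $\phi(E) \sqcap \neg\phi(E')$ is finitely consistent in $\phi(S)$.
By Lemma 4.7 there is a finite relation-descriptive model $\mathcal{I}$ of $\phi(S)$ with $d \in (\phi(E))^{\mathcal{I}}$ and
$d \notin (\phi(E'))^{\mathcal{I}}$, for some $d \in \Delta^{\mathcal{I}}$. By part 2 of Proposition 4.8, $\beta_S(\mathcal{I})$ is a database state legal
for $S$ in which $d \in E^{\beta_S(\mathcal{I})}$ and $d \notin E'^{\beta_S(\mathcal{I})}$. Therefore $E$ does not inherit from $E'$.

"$\Leftarrow$" Assume $E$ does not inherit from $E'$. Then there is a database state $\mathcal{B}$ legal
for $S$ where for an instance $e \in E^{\mathcal{B}}$ we have $e \notin E'^{\mathcal{B}}$. By part 1 of Proposition 4.8,
$\alpha_S(\mathcal{B})$ is a finite model of $\phi(S)$ in which $e \in (\phi(E))^{\alpha_S(\mathcal{B})}$ and $e \notin (\phi(E'))^{\alpha_S(\mathcal{B})}$. Therefore
$\phi(S) \not\models_f \phi(E) \preceq \phi(E')$. □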
Theorem 4.9 allows us to effectively exploit the reasoning methods that have been
developed for ALUNI in order to reason on ER schemata. The complexity of the resulting method
for reasoning on ER schemata is exponential. Observe however, that the known algorithms
for reasoning on ER schemata are also exponential (Calvanese & Lenzerini, 1994b), and
that the precise computational complexity of the problem is still open.

Moreover, by exploiting the correspondence with ALUNI, it becomes possible to add to 
the ER model (and more in general to semantic data models) several features and modeling 
primitives that are currently missing, and which have been considered important, and fully 
take them into account when reasoning over schemata. Such additional features include for 
example the possibility to specify and use arbitrary boolean combinations of entities, and 
to refine properties of entities along ISA hierarchies. 

5. Object-Oriented Data Models 

Object-oriented data models have been proposed with the goal of devising database for- 
malisms that could be integrated with object-oriented programming systems (Kim, 1990). 
They are the subject of an active area of research in the database field, and are based on 
the following features: 

• They rely on the notion of object identifier at the extensional level (as opposed to 
traditional data models which are value-oriented) and on the notion of class at the 
intensional level. 

• The structure of the classes is specified by means of typing and inheritance.

As in the previous section, we present the common basis of object-oriented data models
with other class-based formalisms by introducing a language for specifying object-oriented
schemata, and we show that such schemata can be correctly represented as ALUNI knowledge
bases. In our analysis, we concentrate on the structural aspects of object-oriented
data models. One of the characteristics of the object-oriented approach is that it also provides
mechanisms for specifying the dynamic properties of classes and objects, typically
through the definition of methods associated with the classes. Those aspects are outside the
scope of our investigation. Nevertheless, we argue that general techniques for schema level
reasoning, in particular type consistency and type inference, can be profitably exploited for
restricted forms of reasoning on methods (Abiteboul, Kanellakis, Ramaswamy, & Waller,
1992).

5.1 Syntax of an Object-Oriented Model 

Below we define a simple object-oriented language in the style of most popular models
featuring complex objects and object identity. Although we do not refer to any specific
formalism, our model is inspired by the ones presented by Abiteboul and Kanellakis (1989)
and by Hull and King (1987).

Definition 5.1 An object-oriented schema is a tuple $S = (\mathcal{C}_S, \mathcal{A}_S, \mathcal{D}_S)$, where:

• $\mathcal{C}_S$ is a finite set of class names, denoted by the letter $C$.

• $\mathcal{A}_S$ is a finite set of attribute names, denoted by the letter $A$.

• $\mathcal{D}_S$ is a finite set of class declarations of the form

      Class $C$ is-a $C_1, \ldots, C_n$ type-is $T$,

  in which $T$ denotes a type expression built according to the following syntax:

      $T \rightarrow C$ |
            Union $T_1, \ldots, T_k$ End |
            Set-of $T$ |
            Record $A_1\!:T_1, \ldots, A_k\!:T_k$ End.

  $\mathcal{D}_S$ contains exactly one such declaration for each class $C \in \mathcal{C}_S$. ■
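The grammar of Definition 5.1 can be mirrored by a small abstract syntax tree. The following Python fragment is only a minimal sketch: the names `ClassRef`, `UnionOf`, `SetOf`, `RecordOf`, `ClassDecl`, and `make_schema` are hypothetical and do not come from any of the systems discussed here. It encodes two of the declarations of Figure 7 and enforces the requirement that each class has exactly one declaration.

```python
from dataclasses import dataclass
from typing import Dict, Tuple, Union as TypingUnion


@dataclass(frozen=True)
class ClassRef:
    """T -> C: a reference to a class name."""
    name: str


@dataclass(frozen=True)
class UnionOf:
    """T -> Union T1, ..., Tk End."""
    members: Tuple["TypeExpr", ...]


@dataclass(frozen=True)
class SetOf:
    """T -> Set-of T."""
    element: "TypeExpr"


@dataclass(frozen=True)
class RecordOf:
    """T -> Record A1:T1, ..., Ak:Tk End, stored as (attribute, type) pairs."""
    fields: Tuple[Tuple[str, "TypeExpr"], ...]


TypeExpr = TypingUnion[ClassRef, UnionOf, SetOf, RecordOf]


@dataclass(frozen=True)
class ClassDecl:
    """Class C is-a C1, ..., Cn type-is T."""
    name: str
    is_a: Tuple[str, ...]
    type_is: TypeExpr


def make_schema(*decls: ClassDecl) -> Dict[str, ClassDecl]:
    """Index declarations by class name, rejecting a second declaration of a class."""
    schema: Dict[str, ClassDecl] = {}
    for decl in decls:
        if decl.name in schema:
            raise ValueError(f"duplicate declaration for class {decl.name}")
        schema[decl.name] = decl
    return schema


# The GradStudent and Course declarations of Figure 7.
grad_student = ClassDecl(
    "GradStudent", ("Student",),
    RecordOf((("degree", ClassRef("String")),)),
)
course = ClassDecl(
    "Course", (),
    RecordOf((
        ("enrolls", SetOf(ClassRef("Student"))),
        ("taughtby", ClassRef("Teacher")),
    )),
)
schema = make_schema(grad_student, course)
```

Frozen dataclasses are used so that type expressions are immutable and hashable, which makes it easy to compare them or to use them as dictionary keys in later processing.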



Example 5.2 Figure 7 shows a fragment of the object-oriented schema corresponding to 
the KEE knowledge base of Figure 2. ■ 

Each class declaration imposes constraints on the instances of the class it refers to. The
is-a part of a class declaration allows one to specify inclusion between the sets of instances of
the involved classes, while the type-is part specifies through a type expression the structure
assigned to the objects that are instances of the class.

    Class Teacher type-is
        Union Professor, GradStudent
        End

    Class Course type-is
        Record
            enrolls: Set-of Student,
            taughtby: Teacher
        End

    Class GradStudent is-a Student type-is
        Record
            degree: String
        End

Figure 7: An object-oriented schema



5.2 Semantics of an Object-Oriented Model 

The meaning of an object-oriented schema is given by specifying the characteristics of an 
instance of the schema. The definition of instance makes use of the notions of object 
identifier and value. 

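Let us first characterize the set of values that can be constructed from a set of symbols,
called object identifiers. Given a finite set $\mathcal{O}$ of symbols denoting real world objects, the set
$\mathcal{V}_{\mathcal{O}}$ of values over $\mathcal{O}$ is inductively defined as follows:

• $\mathcal{O} \subseteq \mathcal{V}_{\mathcal{O}}$.

• If $v_1, \ldots, v_k \in \mathcal{V}_{\mathcal{O}}$ then $\{|v_1, \ldots, v_k|\} \in \mathcal{V}_{\mathcal{O}}$.

• If $v_1, \ldots, v_k \in \mathcal{V}_{\mathcal{O}}$ then $[\![A_1\!: v_1, \ldots, A_k\!: v_k]\!] \in \mathcal{V}_{\mathcal{O}}$.

• Nothing else is in $\mathcal{V}_{\mathcal{O}}$.

A database instance $\mathcal{J}$ of a schema $S = (\mathcal{C}_S, \mathcal{A}_S, \mathcal{D}_S)$ is constituted by

• a finite set $\mathcal{O}^{\mathcal{J}}$ of object identifiers;

• a mapping $\pi^{\mathcal{J}}$ assigning to each class in $\mathcal{C}_S$ a subset of $\mathcal{O}^{\mathcal{J}}$;

• a mapping $\rho^{\mathcal{J}}$ assigning a value in $\mathcal{V}_{\mathcal{O}^{\mathcal{J}}}$ to each object in $\mathcal{O}^{\mathcal{J}}$.

Although the set $\mathcal{V}_{\mathcal{O}^{\mathcal{J}}}$ of values that can be constructed from a set $\mathcal{O}^{\mathcal{J}}$ of object identifiers
is infinite, for a database instance one needs only to consider a finite subset of $\mathcal{V}_{\mathcal{O}^{\mathcal{J}}}$.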

Definition 5.3 Given an object-oriented schema $S$ and an instance $\mathcal{J}$ of $S$, the set $\mathcal{V}_{\mathcal{J}}$ of
active values with respect to $\mathcal{J}$ is constituted by:

• the set $\mathcal{O}^{\mathcal{J}}$ of object identifiers.

• the set of values assigned by $\rho^{\mathcal{J}}$ to the elements of $\mathcal{O}^{\mathcal{J}}$, including those values that
are not explicitly associated with object identifiers, but are used to form other values.

■
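The set of active values can be computed by collecting, for each object identifier, its assigned value together with all values nested inside it. The sketch below assumes a hypothetical Python encoding that is not part of the formalism: an object identifier is a string, a set value is a `frozenset`, and a record value is a tuple of `(attribute, value)` pairs. The sample data are invented for illustration.

```python
def subvalues(v):
    """Return v together with every value nested inside it."""
    found = {v}
    if isinstance(v, frozenset):        # set value {|v1, ..., vk|}
        for member in v:
            found |= subvalues(member)
    elif isinstance(v, tuple):          # record value [[A1: v1, ..., Ak: vk]]
        for _attr, component in v:
            found |= subvalues(component)
    return found                        # a string is an object identifier


def active_values(oids, rho):
    """Object identifiers plus all values reachable from the values rho assigns."""
    active = set(oids)
    for oid in oids:
        active |= subvalues(rho[oid])
    return active


# Hypothetical instance: one course c1, one student s1, one string object d1.
oids = {"c1", "s1", "d1"}
rho = {
    "c1": (("enrolls", frozenset({"s1"})), ("taughtby", "s1")),
    "s1": (("degree", "d1"),),
    "d1": "d1",
}
active = active_values(oids, rho)
```

Note that the set value `frozenset({"s1"})` is active although it is not assigned to any object identifier: it occurs only as a component of the record assigned to `c1`.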

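The interpretation of type expressions in $\mathcal{J}$ is defined through an interpretation function
$\cdot^{\mathcal{J}}$ that assigns to each type expression a subset of $\mathcal{V}_{\mathcal{O}^{\mathcal{J}}}$ such that the following conditions
are satisfied:

    $C^{\mathcal{J}} = \pi^{\mathcal{J}}(C)$

    $(\text{Union } T_1, \ldots, T_k \text{ End})^{\mathcal{J}} = T_1^{\mathcal{J}} \cup \cdots \cup T_k^{\mathcal{J}}$

    $(\text{Set-of } T)^{\mathcal{J}} = \{ \{|v_1, \ldots, v_k|\} \mid k \geq 0,\ v_i \in T^{\mathcal{J}}, \text{ for } i \in \{1, \ldots, k\} \}$

    $(\text{Record } A_1\!:T_1, \ldots, A_k\!:T_k \text{ End})^{\mathcal{J}} = \{ [\![A_1\!: v_1, \ldots, A_h\!: v_h]\!] \mid h \geq k,$
        $v_i \in T_i^{\mathcal{J}}, \text{ for } i \in \{1, \ldots, k\},$
        $v_j \in \mathcal{V}_{\mathcal{O}^{\mathcal{J}}}, \text{ for } j \in \{k+1, \ldots, h\} \}.$

Notice that the instances of type record may have more components than those specified in
the type of the class. Thus we are using an open semantics for records, which is typical of
object-oriented data models (Abiteboul & Kanellakis, 1989).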

In order to characterize object-oriented data models we consider the instances that are 
admissible for the schema. 

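Definition 5.4 Let $S = (\mathcal{C}_S, \mathcal{A}_S, \mathcal{D}_S)$ be an object-oriented schema. A database instance
$\mathcal{J}$ of $S$ is said to be legal (with respect to $S$) if for each declaration

    Class $C$ is-a $C_1, \ldots, C_n$ type-is $T$

in $\mathcal{D}_S$, it holds that $C^{\mathcal{J}} \subseteq C_i^{\mathcal{J}}$ for each $i \in \{1, \ldots, n\}$, and that $\rho^{\mathcal{J}}(C^{\mathcal{J}}) \subseteq T^{\mathcal{J}}$. ■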

Therefore, for a legal database instance, the type expressions that are present in the 
schema determine the (finite) set of active values that must be considered. The construction 
of such values is limited by the depth of type expressions. 
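The semantics of this section can be sketched as a membership test for type expressions together with a legality check for instances. The encoding is an assumption made only for this illustration: type expressions are tagged tuples, object identifiers are strings, set values are `frozenset`s, and record values are tuples of `(attribute, value)` pairs. The function names `has_type` and `is_legal`, the attribute `advisor`, and the sample data are hypothetical.

```python
def has_type(v, T, pi):
    """Check whether value v belongs to the interpretation of type expression T.

    pi maps each class name to its set of object identifiers.
    """
    kind = T[0]
    if kind == "class":                 # C^J = pi^J(C)
        return isinstance(v, str) and v in pi.get(T[1], set())
    if kind == "union":                 # union of the member interpretations
        return any(has_type(v, Ti, pi) for Ti in T[1])
    if kind == "set":                   # every element must have the element type
        return isinstance(v, frozenset) and all(has_type(x, T[1], pi) for x in v)
    if kind == "record":
        if not isinstance(v, tuple):
            return False
        components = dict(v)
        # Open semantics: attributes not mentioned in T are simply ignored.
        return all(a in components and has_type(components[a], Ta, pi)
                   for a, Ta in T[1].items())
    raise ValueError(f"unknown type expression: {T!r}")


def is_legal(decls, pi, rho):
    """Check the two conditions of Definition 5.4 for every declaration."""
    for name, parents, T in decls:
        extension = pi.get(name, set())
        if any(not extension <= pi.get(p, set()) for p in parents):
            return False                # is-a part violated
        if any(not has_type(rho[o], T, pi) for o in extension):
            return False                # type-is part violated
    return True


# Class GradStudent is-a Student type-is Record degree: String End
decls = [("GradStudent", ["Student"], ("record", {"degree": ("class", "String")}))]
pi = {"Student": {"s1"}, "GradStudent": {"s1"}, "String": {"d1"}}
rho = {"s1": (("degree", "d1"), ("advisor", "s1")), "d1": "d1"}
legal = is_legal(decls, pi, rho)
```

The record assigned to `s1` carries an extra `advisor` component that the declaration does not mention; under the open semantics for records this does not affect legality.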



5.3 Relationship between Object-Oriented Schemata and ALUNI

We now establish a relationship between ALUNI and the object-oriented language presented
above. This is done by providing a mapping from object-oriented schemata into ALUNI
knowledge bases. Since the interpretation domain for ALUNI knowledge bases consists of
atomic objects, whereas each instance of an object-oriented schema is assigned a possibly
structured value (see the definition of $\mathcal{V}_{\mathcal{O}}$), we need to explicitly represent some of the
notions that underlie the object-oriented language. In particular, while there is a correspondence
between concepts and classes, one must explicitly account for the type structure
of each class. This can be accomplished by introducing in ALUNI the concept AbstractClass,
to represent the classes, and the concepts RecType and SetType to represent the corresponding types.
The associations between classes and types induced by the class declarations, as well as the
basic characteristics of types, are modeled by means of roles: the (functional) role value
models the association between classes and types, and the role member is used for specifying
the type of the elements of a set. Moreover, the concepts representing types are assumed to
be mutually disjoint, and disjoint from the concepts representing classes. These constraints
are expressed by adequate inclusion assertions that will be part of the knowledge base we
are going to define.

We first define the function $\psi$ that maps each type expression into an $\mathcal{ALUNI}$ concept
expression as follows:

• Every class $C$ is mapped into an atomic concept $\psi(C)$.

• Every type expression Union $T_1, \ldots, T_k$ End is mapped into $\psi(T_1) \sqcup \cdots \sqcup \psi(T_k)$.

• Every type expression Set-of $T$ is mapped into $\mathsf{SetType} \sqcap \forall \mathsf{member}.\psi(T)$.

• Every attribute $A$ is mapped into an atomic role $\psi(A)$, and every type expression
Record $A_1\!:T_1, \ldots, A_k\!:T_k$ End is mapped into

    $\mathsf{RecType} \sqcap \forall \psi(A_1).\psi(T_1) \sqcap \exists^{=1} \psi(A_1) \sqcap \cdots \sqcap \forall \psi(A_k).\psi(T_k) \sqcap \exists^{=1} \psi(A_k).$

Using $\psi$ we define the ALUNI knowledge base corresponding to an object-oriented schema.

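Definition 5.5 The ALUNI knowledge base $\psi(S) = (\mathcal{A}, \mathcal{P}, \mathcal{T})$ corresponding to the object-oriented
schema $S = (\mathcal{C}_S, \mathcal{A}_S, \mathcal{D}_S)$ is obtained as follows:

• $\mathcal{A} = \{\mathsf{AbstractClass}, \mathsf{RecType}, \mathsf{SetType}\} \cup \{\psi(C) \mid C \in \mathcal{C}_S\}$.

• $\mathcal{P} = \{\mathsf{value}, \mathsf{member}\} \cup \{\psi(A) \mid A \in \mathcal{A}_S\}$.

• $\mathcal{T}$ consists of the following assertions:

    $\mathsf{AbstractClass} \preceq \exists^{=1} \mathsf{value}$
    $\mathsf{RecType} \preceq \forall \mathsf{value}.\bot$
    $\mathsf{SetType} \preceq \forall \mathsf{value}.\bot \sqcap \neg \mathsf{RecType}$

  and for each class declaration

    Class $C$ is-a $C_1, \ldots, C_n$ type-is $T$

  in $\mathcal{D}_S$, an inclusion assertion

    $\psi(C) \preceq \mathsf{AbstractClass} \sqcap \psi(C_1) \sqcap \cdots \sqcap \psi(C_n) \sqcap \forall \mathsf{value}.\psi(T).$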
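The translation can be sketched as a short recursive function that prints concept expressions as strings. This is an illustrative sketch only: the tagged-tuple encoding of type expressions and the function names `psi` and `translate` are assumptions of the example, and the textual rendering `∃=1 R` stands for the number restriction with value 1.

```python
def psi(T):
    """Map a type expression to a concept expression, rendered as a string."""
    kind = T[0]
    if kind == "class":
        return T[1]
    if kind == "union":
        return "(" + " ⊔ ".join(psi(Ti) for Ti in T[1]) + ")"
    if kind == "set":
        return f"(SetType ⊓ ∀member.{psi(T[1])})"
    if kind == "record":
        parts = ["RecType"]
        for attr, Ta in T[1]:
            # Each attribute yields a typing constraint and a functionality constraint.
            parts += [f"∀{attr}.{psi(Ta)}", f"∃=1 {attr}"]
        return "(" + " ⊓ ".join(parts) + ")"
    raise ValueError(f"unknown type expression: {T!r}")


# Assertions that do not depend on the schema.
FIXED_ASSERTIONS = [
    "AbstractClass ⪯ ∃=1 value",
    "RecType ⪯ ∀value.⊥",
    "SetType ⪯ ∀value.⊥ ⊓ ¬RecType",
]


def translate(decls):
    """Return the assertions of the knowledge base for a list of declarations."""
    assertions = list(FIXED_ASSERTIONS)
    for name, parents, T in decls:
        conjuncts = ["AbstractClass", *parents, f"∀value.{psi(T)}"]
        assertions.append(f"{name} ⪯ " + " ⊓ ".join(conjuncts))
    return assertions


# Two of the declarations of Figure 7.
decls = [
    ("Teacher", [], ("union", [("class", "Professor"), ("class", "GradStudent")])),
    ("GradStudent", ["Student"], ("record", [("degree", ("class", "String"))])),
]
for assertion in translate(decls):
    print(assertion)
```

Only the number restriction with value 1 and no inverse role is ever produced, which matches the limited use of these constructs in the translation.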



From the above translation we can observe that inverse roles are not necessary for the
formalization of object-oriented data models. Indeed, the possibility of referring to the
inverse of an attribute is generally ruled out in such models. However, this strongly limits
the expressive power of the data model, as pointed out in recent papers (see for example
Albano, Ghelli, & Orsini, 1991; Cattell, 1994). Note also that the use of number restrictions
is limited to the value 1, which corresponds to existence constraints and functionality,
whereas union is used in a more general form than, for example, in the KEE system.

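Example 5.2 (cont.) We illustrate the translation on the fragment of object-oriented
schema in Figure 7. The corresponding ALUNI knowledge base is shown in Figure 8. ■

$\mathcal{K} = (\mathcal{A}, \mathcal{P}, \mathcal{T})$, where

    $\mathcal{A}$ = {AbstractClass, RecType, SetType, String, Course,
           Teacher, Professor, Student, GradStudent},

    $\mathcal{P}$ = {value, member, enrolls, taughtby, degree},

and the set $\mathcal{T}$ of assertions consists of:

    $\mathsf{Course} \preceq \mathsf{AbstractClass} \sqcap \forall \mathsf{value}.(\mathsf{RecType} \sqcap \exists^{=1} \mathsf{enrolls} \sqcap \exists^{=1} \mathsf{taughtby} \sqcap$
                 $\forall \mathsf{enrolls}.(\mathsf{SetType} \sqcap \forall \mathsf{member}.\mathsf{Student}) \sqcap \forall \mathsf{taughtby}.\mathsf{Teacher})$

    $\mathsf{Teacher} \preceq \mathsf{AbstractClass} \sqcap \forall \mathsf{value}.(\mathsf{GradStudent} \sqcup \mathsf{Professor})$

    $\mathsf{GradStudent} \preceq \mathsf{AbstractClass} \sqcap \mathsf{Student} \sqcap \forall \mathsf{value}.(\mathsf{RecType} \sqcap \forall \mathsf{degree}.\mathsf{String} \sqcap \exists^{=1} \mathsf{degree})$

    $\mathsf{AbstractClass} \preceq \exists^{=1} \mathsf{value}$

    $\mathsf{RecType} \preceq \forall \mathsf{value}.\bot$

    $\mathsf{SetType} \preceq \forall \mathsf{value}.\bot \sqcap \neg \mathsf{RecType}$

Figure 8: The ALUNI knowledge base corresponding to the object-oriented schema in Figure 7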



Below we discuss the effectiveness of the translation $\psi$. First of all observe that the
ALUNI knowledge base $\psi(S)$ resulting from the translation of an object-oriented schema $S$
may admit models that do not have a direct counterpart among legal database instances
of $S$. More precisely, both an interpretation of $\psi(S)$ and a database instance of $S$ can be
viewed as a directed labeled graph. In the case of an interpretation, the nodes are domain
individuals and the arcs are labeled with roles. In the case of a database instance, the
nodes are either object identifiers or active values, and an arc either connects an object
identifier to its associated value (in which case it is labeled with value), or is part of the
sub-structure representing a set or record value (in which case it is labeled with member or
with an attribute, in accordance with the type of the value). In a legal database instance
of $S$, a value $v$ is represented by a sub-structure that has the form of a finite tree with $v$ as
root, set and record values as intermediate nodes, and object identifiers as leaves. Clearly,
such a sub-structure does not contain cycles. Conversely, in a model of $\psi(S)$, there may
be cycles involving only nodes that are instances of SetType and RecType and in which
all roles are different from value. We call such cycles bad. A model containing bad cycles
cannot be put directly in correspondence with a legal database instance. Also, due to the
open semantics of records, one cannot adopt a different translation for which bad cycles in
the model are ruled out.
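Viewing an interpretation as a labeled graph, bad cycles can be detected by a depth-first search restricted to instances of RecType and SetType and to arcs not labeled with value. The following Python sketch assumes a hypothetical encoding (a dictionary giving the kind of each node and a list of labeled arcs); the function name `has_bad_cycle` and the two sample graphs are invented for illustration and are not the graph of any figure in this paper.

```python
def has_bad_cycle(kind, edges):
    """Return True if some cycle uses only RecType/SetType nodes and non-value arcs.

    kind maps each node to "AbstractClass", "RecType" or "SetType";
    edges is a list of (source, role, target) triples.
    """
    value_nodes = {n for n, k in kind.items() if k in ("RecType", "SetType")}
    successors = {n: [] for n in value_nodes}
    for source, role, target in edges:
        if role != "value" and source in value_nodes and target in value_nodes:
            successors[source].append(target)

    WHITE, GREY, BLACK = 0, 1, 2        # unvisited, on the current path, finished
    color = {n: WHITE for n in value_nodes}

    def visit(n):
        color[n] = GREY
        for m in successors[n]:
            if color[m] == GREY:        # back arc: a cycle among value nodes
                return True
            if color[m] == WHITE and visit(m):
                return True
        color[n] = BLACK
        return False

    return any(color[n] == WHITE and visit(n) for n in value_nodes)


# Hypothetical model with a bad cycle between two record nodes.
bad_kind = {"o1": "AbstractClass", "v1": "RecType", "v2": "RecType"}
bad_edges = [("o1", "value", "v1"), ("v1", "a1", "v2"), ("v2", "a2", "v1")]

# Hypothetical model whose only cycle passes through an object via a value arc.
ok_kind = {"o2": "AbstractClass", "v3": "RecType", "v4": "RecType", "v5": "RecType"}
ok_edges = [("o2", "value", "v3"), ("v3", "a1", "v4"),
            ("v4", "a2", "v5"), ("v5", "a3", "o2")]

found_bad = has_bad_cycle(bad_kind, bad_edges)
found_ok = has_bad_cycle(ok_kind, ok_edges)
```

The second graph is cyclic as well, but its cycle contains an arc labeled with value and an instance of AbstractClass, so it is not reported.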



Example 5.6 Consider the object-oriented schema $S$, containing a single class declaration

    Class $C$ type-is Record $a_1$: Record $a_2$: Record $a_3$: $C$ End End End

which is translated to

    $C \preceq \mathsf{AbstractClass} \sqcap$
        $\forall \mathsf{value}.(\mathsf{RecType} \sqcap \exists^{=1} a_1 \sqcap \forall a_1.(\mathsf{RecType} \sqcap \exists^{=1} a_2 \sqcap \forall a_2.(\mathsf{RecType} \sqcap \exists^{=1} a_3 \sqcap \forall a_3.C))).$

Figure 9: A model containing cycles

Figure 9 shows a model of $\psi(S)$ represented as a graph. For clarity, we have named the
instances of $C$, and hence of AbstractClass, with $o$ and the instances of RecType with
$v$. Observe the two different types of cycles in the graph. The cycle involving individuals
$o_2$, $v_3$, $v_4$, and $v_5$ does not cause any problems since it contains an arc labeled with value,
which is not part of the structure constituting a complex value. In fact, $v_3$ represents the
record value $[\![a_1\!: [\![a_2\!: [\![a_3\!: o_2]\!]]\!]]\!]$. On the other hand, due to the bad cycle involving $v_1$ and
$v_2$, individual $v_1$ represents (together with $o_2$, connected via $a_3$ to $v_1$) a record of infinite
depth. ■

We can nevertheless establish a correspondence from finite models of $\psi(S)$ possibly
containing bad cycles to legal instances of the object-oriented schema $S$. This can be
achieved by unfolding the bad cycles in a model of $\psi(S)$ to infinite trees. Obviously, the
unfolding of a cycle into an infinite tree generates an infinite number of nodes, which
would correspond to an infinite database state. However, we can restrict the duplication of
individuals to those that represent set and record values, and thus are instances of SetType
and RecType. The instances of AbstractClass, instead, are not duplicated in the process
of unfolding, and therefore their number remains finite. Moreover, since the set of possible
active values associated with each object identifier is bounded by the depth of the schema, we
can in fact block the unfolding of bad cycles at a finite tree of depth equal to the depth
of the schema.

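Let us first formally define the depth of an object-oriented schema $S$.

Definition 5.7 For a type expression $T$ we define $depth(T)$ inductively as follows:

$$
depth(T) =
\begin{cases}
0, & \text{if } T = C. \\
\max_{1 \leq i \leq k} depth(T_i), & \text{if } T = \text{Union } T_1, \ldots, T_k \text{ End}. \\
1 + depth(T'), & \text{if } T = \text{Set-of } T'. \\
1 + \max_{1 \leq i \leq k} depth(T_i), & \text{if } T = \text{Record } A_1\!:T_1, \ldots, A_k\!:T_k \text{ End}.
\end{cases}
$$

The depth of an object-oriented schema $S$ is defined as the maximum of $depth(T)$ for a type
expression $T$ in $S$.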
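A minimal Python sketch of the depth computation follows. It assumes the constructor-nesting reading of depth: class names have depth 0, Union takes the maximum over its members, and each Set-of or Record constructor adds one level. The tagged-tuple encoding and the function names are hypothetical.

```python
def depth(T):
    """Nesting depth of set and record constructors in a type expression."""
    kind = T[0]
    if kind == "class":
        return 0
    if kind == "union":
        return max(depth(Ti) for Ti in T[1])
    if kind == "set":
        return 1 + depth(T[1])
    if kind == "record":
        return 1 + max(depth(Ta) for _attr, Ta in T[1])
    raise ValueError(f"unknown type expression: {T!r}")


def schema_depth(decls):
    """Maximum depth over the type expressions of all class declarations."""
    return max(depth(T) for _name, _parents, T in decls)


# Record a1: Record a2: Record a3: C End End End (the schema of Example 5.6)
inner = ("record", [("a3", ("class", "C"))])
middle = ("record", [("a2", inner)])
nested = ("record", [("a1", middle)])

# The three declarations of Figure 7.
figure7 = [
    ("Teacher", [], ("union", [("class", "Professor"), ("class", "GradStudent")])),
    ("Course", [], ("record", [("enrolls", ("set", ("class", "Student"))),
                               ("taughtby", ("class", "Teacher"))])),
    ("GradStudent", ["Student"], ("record", [("degree", ("class", "String"))])),
]
nested_depth = depth(nested)
figure7_depth = schema_depth(figure7)
```

Under this reading the schema of Example 5.6 has depth 3 and the schema of Figure 7 has depth 2, the latter being due to the set nested inside the record of Course.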
Figure 10: The unfolded version of the model in Figure 9

We can now introduce the notion of unfolding of an ALUNI interpretation.

Definition 5.8 Let $S$ be an object-oriented schema, $\psi(S)$ its translation in ALUNI and $\mathcal{I}$
a finite interpretation of $\psi(S)$. We call unfolded version of $\mathcal{I}$ the interpretation obtained
from $\mathcal{I}$ as follows: For each individual $v$ that is part of a bad cycle, unfold the bad cycle
into an (infinite) tree having $v$ as root, by generating new individuals only for the instances
of RecType and SetType. For a nonnegative integer $m$, we call $m$-unfolded version of $\mathcal{I}$,
denoted as $\mathcal{I}|_m$, the interpretation obtained by truncating at depth $m$ each infinite tree
generated in the process of unfolding. ■



Example 5.6 (cont.) Figure 10 shows the unfolded version of the model in Figure 9.
Notice that only the bad cycle has been unfolded to an infinite tree, and that all arcs labeled
with $a_3$ lead to $o_2$, which is an instance of AbstractClass and has not been duplicated. ■

The correctness of $\psi(S)$ is sanctioned by the following proposition.

Proposition 5.9 For every object-oriented schema $S$ of depth $m$, there exist mappings:

1. $\alpha_S$ from instances of $S$ into finite interpretations of $\psi(S)$ and $\alpha_V$ from active values
of instances of $S$ into domain elements of the finite interpretations of $\psi(S)$ such that:
For each legal instance $\mathcal{J}$ of $S$, $\alpha_S(\mathcal{J})$ is a finite model of $\psi(S)$, and for each type
expression $T$ of $S$ and each $v \in \mathcal{V}_{\mathcal{J}}$, $v \in T^{\mathcal{J}}$ if and only if $\alpha_V(v) \in (\psi(T))^{\alpha_S(\mathcal{J})}$.

2. $\beta_S$ from finite interpretations of $\psi(S)$ into instances of $S$ and $\beta_V$ from domain
elements of the $m$-unfolded versions of the finite interpretations of $\psi(S)$ into active
values of instances of $S$, such that: For each finite model $\mathcal{I}$ of $\psi(S)$, $\beta_S(\mathcal{I})$ is a legal
instance of $S$, and for each concept $\psi(T)$, which is the translation of a type expression
$T$ of $S$, and each $d \in \Delta^{\mathcal{I}|_m}$, $d \in (\psi(T))^{\mathcal{I}|_m}$ if and only if $\beta_V(d) \in T^{\beta_S(\mathcal{I})}$.

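Proof. (1) Given a database instance $\mathcal{J}$ we define an interpretation $\alpha_S(\mathcal{J})$ of $\psi(S)$ as
follows:

• $\alpha_V$ is a function mapping every element of $\mathcal{V}_{\mathcal{J}}$ into a distinct element of $\Delta^{\alpha_S(\mathcal{J})}$.
Therefore $\Delta^{\alpha_S(\mathcal{J})}$ is defined as the set of elements $\alpha_V(v)$ such that $v \in \mathcal{V}_{\mathcal{J}}$. Moreover
we denote with $\Delta_{id}$, $\Delta_{rec}$, and $\Delta_{set}$ the elements of $\Delta^{\alpha_S(\mathcal{J})}$ corresponding to object
identifiers, record and set values, respectively.

• The interpretation of atomic concepts is defined as follows:

    $(\psi(C))^{\alpha_S(\mathcal{J})} = \{\alpha_V(o) \mid o \in \pi^{\mathcal{J}}(C)\}$,
        for every $\psi(C)$ corresponding to a class name $C$ in $S$
    $\mathsf{AbstractClass}^{\alpha_S(\mathcal{J})} = \Delta_{id}$
    $\mathsf{RecType}^{\alpha_S(\mathcal{J})} = \Delta_{rec}$
    $\mathsf{SetType}^{\alpha_S(\mathcal{J})} = \Delta_{set}$

• The interpretation of atomic roles is defined as follows:

    $(\psi(A))^{\alpha_S(\mathcal{J})} = \{(d_1, d_2) \mid d_1 \in \Delta_{rec} \text{ and } \alpha_V^{-1}(d_1) = [\![\ldots, A\!: \alpha_V^{-1}(d_2), \ldots]\!]\}$,
        for every $\psi(A)$ corresponding to an attribute name $A$ in $S$
    $\mathsf{member}^{\alpha_S(\mathcal{J})} = \{(d_1, d_2) \mid d_1 \in \Delta_{set} \text{ and } \alpha_V^{-1}(d_1) = \{|\ldots, \alpha_V^{-1}(d_2), \ldots|\}\}$
    $\mathsf{value}^{\alpha_S(\mathcal{J})} = \{(d_1, d_2) \mid (\alpha_V^{-1}(d_1), \alpha_V^{-1}(d_2)) \in \rho^{\mathcal{J}}\}$

We prove that for each type $T$ and each $v \in \mathcal{V}_{\mathcal{J}}$, $v \in T^{\mathcal{J}}$ if and only if $\alpha_V(v) \in
(\psi(T))^{\alpha_S(\mathcal{J})}$. The first part of the claim then follows from the definition of $\alpha_S(\mathcal{J})$. The
proof is by induction on the structure of the type expression.

Base case: $T = C$ (i.e., $T$ is a class name). If $o \in C^{\mathcal{J}}$ then $\alpha_V(o) \in (\psi(C))^{\alpha_S(\mathcal{J})}$, and
vice-versa if $d \in (\psi(C))^{\alpha_S(\mathcal{J})}$ then $\alpha_V^{-1}(d) \in C^{\mathcal{J}}$.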

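Inductive case: $T = \text{Record } A_1\!:T_1, \ldots, A_k\!:T_k \text{ End}$ and $\psi(T) = \mathsf{RecType} \sqcap
\forall \psi(A_1).\psi(T_1) \sqcap \exists^{=1} \psi(A_1) \sqcap \cdots \sqcap \forall \psi(A_k).\psi(T_k) \sqcap \exists^{=1} \psi(A_k)$. We assume that $v \in T_i^{\mathcal{J}}$
iff $\alpha_V(v) \in (\psi(T_i))^{\alpha_S(\mathcal{J})}$, for $i \in \{1, \ldots, k\}$, and show that $v \in T^{\mathcal{J}}$ iff $\alpha_V(v) \in (\psi(T))^{\alpha_S(\mathcal{J})}$.

Suppose that $v \in T^{\mathcal{J}}$, i.e., $v = [\![A_1\!: v_1, \ldots, A_h\!: v_h]\!]$ with $h \geq k$ and $v_i \in T_i^{\mathcal{J}}$ for
$i \in \{1, \ldots, k\}$. By induction hypothesis $\alpha_V(v_i) \in (\psi(T_i))^{\alpha_S(\mathcal{J})}$, for $i \in \{1, \ldots, k\}$, and by
definition of $\alpha_S$, $\alpha_V(v) \in \mathsf{RecType}^{\alpha_S(\mathcal{J})}$, $(\alpha_V(v), \alpha_V(v_i)) \in (\psi(A_i))^{\alpha_S(\mathcal{J})}$ for $i \in \{1, \ldots, k\}$,
and all roles $\psi(A)$ corresponding to attribute names are functional. Therefore, $\alpha_V(v) \in
(\psi(T))^{\alpha_S(\mathcal{J})}$.

Conversely, suppose that $d = \alpha_V(v) \in (\psi(T))^{\alpha_S(\mathcal{J})}$. Then, for each $i \in \{1, \ldots, k\}$ there is
exactly one $d_i \in \Delta^{\alpha_S(\mathcal{J})}$ such that $(d, d_i) \in (\psi(A_i))^{\alpha_S(\mathcal{J})}$, and moreover $d_i \in (\psi(T_i))^{\alpha_S(\mathcal{J})}$.
By definition of $\alpha_S$ we have $v = [\![A_1\!: v_1, \ldots, A_h\!: v_h]\!]$, with $h \geq k$ and $v_i = \alpha_V^{-1}(d_i)$, for
$i \in \{1, \ldots, k\}$. By induction hypothesis $v_i \in T_i^{\mathcal{J}}$, for $i \in \{1, \ldots, k\}$, and therefore $v \in
(\text{Record } A_1\!:T_1, \ldots, A_k\!:T_k \text{ End})^{\mathcal{J}}$.

The cases for $T = \text{Union } T_1, \ldots, T_k \text{ End}$ and $T = \text{Set-of } T'$ can be treated analogously.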

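(2) Given a finite model $\mathcal{I}$ of $\psi(S)$, where $S$ has depth $m$, we define a legal database
instance $\beta_S(\mathcal{I})$ as follows:

• $\beta_V$ is a function mapping every element of $\Delta^{\mathcal{I}|_m}$ into a distinct element of $\mathcal{V}_{\beta_S(\mathcal{I})}$ such
that the following conditions are satisfied:

  - $\mathcal{O}^{\beta_S(\mathcal{I})} \subseteq \mathcal{V}_{\beta_S(\mathcal{I})}$ is the set of elements $\beta_V(d)$ such that $d \in \mathsf{AbstractClass}^{\mathcal{I}|_m}$.

  - If $d \in \mathsf{RecType}^{\mathcal{I}|_m}$, $(d, d_i) \in (\psi(A_i))^{\mathcal{I}|_m}$, for $i \in \{1, \ldots, k\}$, and there is no
    other individual $d' \in \Delta^{\mathcal{I}|_m}$ and attribute $A'$ such that $(d, d') \in (\psi(A'))^{\mathcal{I}|_m}$, then
    $\beta_V(d) = [\![A_1\!: \beta_V(d_1), \ldots, A_k\!: \beta_V(d_k)]\!]$.

  - If $d \in \mathsf{SetType}^{\mathcal{I}|_m}$, $(d, d_i) \in \mathsf{member}^{\mathcal{I}|_m}$, for $i \in \{1, \ldots, k\}$, and there is no
    other individual $d' \in \Delta^{\mathcal{I}|_m}$ such that $(d, d') \in \mathsf{member}^{\mathcal{I}|_m}$, then $\beta_V(d) =
    \{|\beta_V(d_1), \ldots, \beta_V(d_k)|\}$.

• For every class name $C$, $\pi^{\beta_S(\mathcal{I})}(C) = \{\beta_V(d) \mid d \in (\psi(C))^{\mathcal{I}|_m}\}$.

• $\rho^{\beta_S(\mathcal{I})} = \{(o, v) \mid \beta_V(d_1) = o, \beta_V(d_2) = v, \text{ and } (d_1, d_2) \in \mathsf{value}^{\mathcal{I}|_m}\}$.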

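We first prove that for each concept $\psi(T)$, which is the translation of a type expression
$T$ of $S$, and each $d \in \Delta^{\mathcal{I}|_m}$, $d \in (\psi(T))^{\mathcal{I}|_m}$ if and only if $\beta_V(d) \in T^{\beta_S(\mathcal{I})}$. The proof is
by induction on the structure of the concept expression. Again for the inductive part we
restrict our attention to the case of record types.

Base case: $T = C$ (i.e., $\psi(T)$ is an atomic concept). If $d \in (\psi(C))^{\mathcal{I}|_m}$ then $\beta_V(d) \in
C^{\beta_S(\mathcal{I})}$, and vice-versa if $o \in C^{\beta_S(\mathcal{I})}$ then $\beta_V^{-1}(o) \in (\psi(C))^{\mathcal{I}|_m}$.

Inductive case: $\psi(T) = \mathsf{RecType} \sqcap \forall \psi(A_1).\psi(T_1) \sqcap \exists^{=1} \psi(A_1) \sqcap \cdots \sqcap \forall \psi(A_k).\psi(T_k) \sqcap
\exists^{=1} \psi(A_k)$ and $T = \text{Record } A_1\!:T_1, \ldots, A_k\!:T_k \text{ End}$. We assume that $d \in (\psi(T_i))^{\mathcal{I}|_m}$ iff
$\beta_V(d) \in T_i^{\beta_S(\mathcal{I})}$, for $i \in \{1, \ldots, k\}$, and show that $d \in (\psi(T))^{\mathcal{I}|_m}$ iff $\beta_V(d) \in T^{\beta_S(\mathcal{I})}$.

Suppose that $d \in (\psi(T))^{\mathcal{I}|_m}$. Then $d \in \mathsf{RecType}^{\mathcal{I}|_m}$ and for each $i \in \{1, \ldots, k\}$ there
is an individual $d_i$ such that $d_i \in (\psi(T_i))^{\mathcal{I}|_m}$ and $(d, d_i) \in (\psi(A_i))^{\mathcal{I}|_m}$. By construction
$\beta_V(d) = [\![A_1\!: v_1, \ldots, A_h\!: v_h]\!]$ for some $h \geq k$. Moreover, by induction hypothesis $\beta_V(d_i) \in
T_i^{\beta_S(\mathcal{I})}$ and therefore $\beta_V(d) \in T^{\beta_S(\mathcal{I})}$.

Conversely, suppose that $\beta_V(d) \in T^{\beta_S(\mathcal{I})}$, i.e., $\beta_V(d) = [\![A_1\!: v_1, \ldots, A_h\!: v_h]\!]$ with $h \geq k$
and $v_i \in T_i^{\beta_S(\mathcal{I})}$ for $i \in \{1, \ldots, k\}$. By induction hypothesis $d_i = \beta_V^{-1}(v_i) \in (\psi(T_i))^{\mathcal{I}|_m}$,
for $i \in \{1, \ldots, k\}$, and by definition of $\beta_S$, $d \in \mathsf{RecType}^{\mathcal{I}|_m}$ and $(d, d_i) \in (\psi(A_i))^{\mathcal{I}|_m}$
for $i \in \{1, \ldots, k\}$. Since all roles $\psi(A)$ corresponding to attribute names are functional,
$d \in (\psi(T))^{\mathcal{I}|_m}$.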

It remains to show that for each declaration

    Class $C$ is-a $C_1, \ldots, C_n$ type-is $T$

in $\mathcal{D}_S$, (a) $C^{\beta_S(\mathcal{I})} \subseteq C_j^{\beta_S(\mathcal{I})}$ for each $j \in \{1, \ldots, n\}$, and (b) $\rho^{\beta_S(\mathcal{I})}(C^{\beta_S(\mathcal{I})}) \subseteq T^{\beta_S(\mathcal{I})}$.

(a) follows from the fact that $\psi(S)$ contains the assertion $\psi(C) \preceq \psi(C_1) \sqcap \cdots \sqcap \psi(C_n)$
and from the definition of $\pi^{\beta_S(\mathcal{I})}$.

(b) follows from what we have shown above and from the fact that $\mathcal{I}|_m$ still satisfies the
assertion $\psi(C) \preceq \mathsf{AbstractClass} \sqcap \forall \mathsf{value}.\psi(T)$. In fact, for some $d \in (\psi(C))^{\mathcal{I}}$ let $d'$ be
the unique individual such that $(d, d') \in \mathsf{value}^{\mathcal{I}}$. Since $\mathcal{I}$ is a model of $\psi(S)$, $d' \in (\psi(T))^{\mathcal{I}}$.
We argue that also $d' \in (\psi(T))^{\mathcal{I}|_m}$. If $d'$ is not part of a bad cycle in $\mathcal{I}$, then $\mathcal{I}$ and
$\mathcal{I}|_m$ coincide on the sub-structure rooted at $d'$ and formed by the individuals reached via
member and roles corresponding to attributes, and we are done. Otherwise, in $\mathcal{I}|_m$ such
sub-structure is expanded into a finite tree. Since by construction the depth of this tree
is at least $depth(T)$, and the connections between individuals in $\mathcal{I}$ are preserved in $\mathcal{I}|_m$, it
follows that $d' \in (\psi(T))^{\mathcal{I}|_m}$. □

The basic reasoning services considered in object-oriented databases are subtyping
(check whether a type denotes a subset of another type in every legal instance) and type
consistency (check whether a type is consistent in a legal instance). Based on Proposition 5.9,
we can show that these forms of reasoning are fully captured by finite concept
consistency and finite concept subsumption in ALUNI knowledge bases.

Theorem 5.10 Let $S$ be an object-oriented schema, $T$, $T'$ two type expressions in $S$, and
$\psi(S)$ the translation of $S$. Then the following holds:

1. $T$ is consistent in $S$ if and only if $\psi(S) \not\models_f \psi(T) \preceq \bot$.

2. $T$ is a subtype of $T'$ in $S$ if and only if $\psi(S) \models_f \psi(T) \preceq \psi(T')$.

Proof. The proof is analogous to the proof of Theorem 4.9, but it makes use of Proposition 5.9
instead of Proposition 4.8. □

Again, the correspondence with ALUNI established by Theorem 5.10 allows us to make
use of the reasoning techniques developed for ALUNI to reason on object-oriented schemata.
Observe that reasoning in object-oriented models is already PSPACE-hard (Bergamaschi
& Nebel, 1994) and thus the known algorithms are exponential. However, by resorting
to ALUNI, it becomes possible to also take into account, when reasoning, various extensions
of the object-oriented formalism. Such extensions are useful for conceptual modeling and
have already been proposed in the literature (Cattell & Barry, 1997). First of all, the same
considerations developed for the ER model with regard to the use of arbitrary boolean
constructs on classes can be applied also in the object-oriented setting, which provides
disjunction but does not admit any form of negation. Additional features that can be added
to object-oriented models are inverses of attributes, cardinality constraints on set-valued
attributes, and more general forms of restrictions on the values of attributes.

6. Related Work 

In this section we briefly discuss recent results on the correspondence between class-based 
formalisms and on techniques for reasoning in ALUNI and in class-based representation 
formalisms. 

6.1 Relationships among Class-Based Formalisms 

In the past there have been several attempts to establish relationships among class-based
formalisms. Bläsius, Hedtstück, and Rollinger (1990) and Lenzerini, Nardi, and Simi (1991)
carry out a comparative analysis of class-based languages and attempt to provide a unified
view. The analysis makes it clear that several difficulties arise in identifying a common
framework for the formalisms developed in different areas. Some recent papers address this
problem. For example, an analysis of the relationships between frame-based languages and
types in programming languages has been carried out by Borgida (1992), while Bergamaschi
and Sartori (1992) and Piza, Schewe, and Schmidt (1992) use frame-based languages to enrich
the deductive capabilities of semantic and object-oriented data models.

Artale, Cesarini, and Soda (1996) study reasoning in object-oriented data models by
presenting a translation to DLs in the style of the one discussed in Section 5. However, the
proposed translation is applicable only in the case where the schema contains no recursive
class declarations. This limitation is not present in the work by Bergamaschi and Nebel
(1994), where a formalism derived from DLs is used to model complex objects and an
algorithm for computing subsumption between classes is provided.

A recent survey on the application of DLs to the problem of data management has been
presented by Borgida (1995). The application to the task of data modeling of reasoning
techniques derived from the correspondences presented in Sections 4 and 5 is discussed in
more detail by Calvanese, Lenzerini, and Nardi (1998).

Recently, there have also been proposals to integrate the object-oriented and the logic
programming paradigms (Kifer & Wu, 1993; Kifer, Lausen, & Wu, 1995). These proposals
are, however, not directly related to the present work, since they aim at providing mechanisms
for computing with structured objects, rather than means for reasoning over a conceptual
(object-oriented) representation of the domain of interest.

6.2 Reasoning in ALUNI and in Class-Based Representation Formalisms

ALUNI is equipped with techniques to reason both with respect to unrestricted and with
respect to finite models. We briefly sketch the main ideas underlying reasoning in both
contexts. A detailed account of the reasoning techniques is given by Calvanese
(1996c).

6.2.1 Unrestricted Model Reasoning 

Recall that reasoning on a knowledge base with respect to unrestricted models amounts
to checking either concept consistency, i.e., determining whether the knowledge base admits a
(possibly infinite) model in which a given concept has a nonempty extension, or concept
subsumption, i.e., determining whether the extension of one concept is contained in the
extension of another concept in every model (including the infinite ones) of the knowledge
base.

The method to reason in ALUNI with respect to unrestricted models exploits a well-known
correspondence between DLs and Propositional Dynamic Logics (PDLs) (Kozen & Tiuryn,
1990), which are a class of logics specifically designed to reason about programs. The
correspondence, which was first pointed out by Schild (1991), relies on a substantial
similarity of the interpretative structures of both formalisms, and allows one to exploit the
reasoning techniques developed for PDLs to reason in the corresponding DLs. In particular,
since $\mathcal{ALUNI}$, the description language of ALUNI, includes the construct for inverse roles,
for the correspondence one has to resort to converse-PDL, a variant of PDL that includes
converse programs (Kozen & Tiuryn, 1990). However, because of the presence of number
restrictions in $\mathcal{ALUNI}$, which have no direct correspondence in PDLs, we cannot rely on
traditional techniques for reasoning in PDLs. Recently, encoding techniques have been
developed, which allow one to eliminate number restrictions from a knowledge base while
preserving concept consistency and concept subsumption (De Giacomo & Lenzerini, 1994a).
The encoding is applicable to knowledge bases formulated in expressive variants of DLs, and
in particular it can be used to reduce unrestricted model reasoning on ALUNI knowledge
bases (both concept consistency and concept subsumption) to deciding satisfiability of a
formula of converse-PDL. Reasoning in converse-PDL is decidable in EXPTIME (Kozen &
Tiuryn, 1990), and since the encoding is polynomial (De Giacomo & Lenzerini, 1994a) we
obtain an EXPTIME decision procedure for unrestricted concept consistency and concept
subsumption in ALUNI knowledge bases. A simplified form of the encoding, which can be
applied to decide unrestricted concept consistency in ALUNI, has also been presented by
Calvanese et al. (1994).

6.2.2 Finite Model Reasoning 

Recall that reasoning on a knowledge base with respect to finite models amounts to
checking either finite concept consistency or finite concept subsumption, for which only the
finite models of the knowledge base must be considered.

For finite model reasoning, the techniques based on a reduction to reasoning in PDLs
are not applicable. Indeed, the PDL formula corresponding to an ALUNI knowledge base
contains constructs both for converse programs (corresponding to inverse roles) and for
functionality of direct and inverse programs, and thus is a formula of a variant of PDL
which does not have the finite model property (Vardi, 1985). However, after encoding
functionality, one obtains a converse-PDL formula, and since converse-PDL has the finite
model property (Fischer & Ladner, 1979), this formula is satisfiable if and only if it is
finitely satisfiable. This shows that the encoding of number restrictions (and in particular
the encoding of functionality), while preserving unrestricted satisfiability, does not preserve
finite satisfiability (De Giacomo & Lenzerini, 1994a).

For finite model reasoning in ALUNI one can adopt a different technique, which is based
on the idea of separating the reasoning process into two distinct phases (see Calvanese, 1996c,
for full details). The first phase deals with all constructs except number restrictions, and
builds an "expanded knowledge base" in which these constructs are embedded implicitly
in the concepts and roles. In the second phase the assertions involving number restrictions
are used to derive from this expanded knowledge base a system of linear inequalities. The
system is defined in such a way that its solutions of a certain type (acceptable solutions) are
directly related to the finite models of the original knowledge base. In particular, from each
acceptable solution one can directly deduce the cardinalities of the extensions of all concepts
and roles in a possible finite model. The proposed method allows one to establish for ALUNI
EXPTIME decidability for finite concept consistency and for special cases of finite concept
subsumption. By resorting to a more complicated encoding one can obtain a 2EXPTIME
decision procedure for finite concept subsumption in ALUNI in general (Calvanese, 1996a,
1996c).
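To see why number restrictions give rise to inequalities on cardinalities, and why finite and unrestricted reasoning can differ, consider the following small knowledge base. It is a hypothetical illustration written in the notation of this paper; it is not the system of inequalities actually built by the two-phase method, which operates on the expanded knowledge base.

```latex
% Hypothetical knowledge base over concepts C, D and role P (illustration only).
\begin{align*}
C &\preceq \exists^{\geq 2} P \sqcap \forall P.D, &
D &\preceq \exists^{\leq 1} P^{-}, &
D &\preceq C.
\end{align*}
% Counting, in a finite model I, the pairs of P^I that start from instances of C:
%   - each instance of C contributes at least two pairs        (first inequality),
%   - each pair ends in an instance of D, and each instance of D
%     has at most one P-predecessor                             (second inequality),
%   - every instance of D is an instance of C                   (third inequality).
\begin{align*}
2\,|C^{\mathcal{I}}|
  \;\leq\; |\{(c,d) \in P^{\mathcal{I}} \mid c \in C^{\mathcal{I}}\}|
  \;\leq\; |D^{\mathcal{I}}|
  \;\leq\; |C^{\mathcal{I}}|.
\end{align*}
```

The chain forces $|C^{\mathcal{I}}| = 0$ in every finite model, so $C$ is not finitely consistent. With respect to unrestricted models $C$ is nevertheless consistent: in an infinite binary tree whose nodes are all instances of both $C$ and $D$ and whose arcs interpret $P$, every node has two successors and at most one predecessor.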

Reasoning with respect to finite models has also been investigated in the context of de- 
pendency theory in databases. As shown by Casanova, Fagin, and Papadimitriou (1984) for 
the relational model, when functional and inclusion dependencies interact, the dependency 
implication problem in the finite case differs from the one in the unrestricted case. While 
the implication problem for arbitrary functional and inclusion dependencies is undecidable 
(Chandra &: Vardi, 1985; Mitchell, 1983), for functional and unary inclusion dependencies 
it is solvable in polynomial time, both in the finite and the unrestricted case (Cosmadakis 
et al., 1990). 
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Consistency with respect to finite models of schemata expressed in an enriched Entity- 
Relationship model with cardinality constraints has been shown decidable in polynomial 
time by Lenzerini and Nobili (1990). Calvanese and Lenzerini (1994b) extend the decid- 
ability result to include also ISA relationships, and Calvanese and Lenzerini (1994a) show 
EXPTIME decidability of reasoning in an expressive object-oriented model. An algorithm 
for computing a refinement ordering for types (the analogue to a concept hierarchy) in the 
framework of the O2 object-oriented model is discussed by Lecluse and Richard (1989). 

Reasoning in the strict sublanguage of ALUNI obtained by omitting inverse roles and 
number restrictions is already EXPTIME-hard (Calvanese, 1996b). Therefore, the known 
algorithms for deciding unrestricted concept consistency and subsumption and finite concept 
consistency are essentially optimal. 

7. Conclusions 

We have presented a unified framework for representing information about class structures 
and reasoning about them. We have pursued this goal by looking at various class-based 
formalisms proposed in different fields of computer science, namely frame-based systems 
used in knowledge representation, and semantic and object-oriented data models used in 
databases, and rephrasing them in the framework of description logics. The resulting de- 
scription logic, called ALUNI, includes a combination of constructs that was not addressed 
before, although all of the constructs had previously been considered separately. 

The major achievement of the paper is the demonstration that class-based formalisms 
can be given a precise characterization by means of a powerful fragment of first-order logic, 
which thus can be regarded as the essential core of the class-based representation formalisms 
belonging to all three families mentioned above. This has several consequences. 

First of all, any of the formalisms considered in the paper can be enriched with constructs 
originating from other formalisms and treated in the general framework. In this sense, the 
work reported here not only provides a common powerful representation formalism, but 
may also contribute to significant developments for the languages belonging to all three 
families. For example, the usage of inverse roles in concept languages greatly enhances the 
expressivity of roles, while the combination of ISA, number restrictions, and union enriches 
the reasoning capabilities available in semantic data models. 

Secondly, the comparison of class-based formalisms from the fields of knowledge rep- 
resentation and conceptual data modeling makes it feasible to address the development of 
reasoning tools to support conceptual modeling (Calvanese et al., 1998). In fact, reason- 
ing capabilities become especially important in complex scenarios such as those arising in 
heterogeneous database applications and Data Warehousing. This line of work was among 
the motivations for developing systems based on expressive description logics (Horrocks, 
1998; Horrocks & Patel-Schneider, 1999), and has led to further extending the language of 
description logics to support Information Integration and, more specifically, the conceptual 
modeling of Data Warehouses (Calvanese, De Giacomo, Lenzerini, Nardi, & Rosati, 1998). 